This book (if this can actually be called a book) began as a collection of handouts 
written by me for the first year undergraduate laboratories at University College Dublin 
(UCD), while I was demonstrating during the school year of 2009-2010. I realized that 
perhaps these handouts could be useful in the future, so in my spare time (primarily 
during free periods at a QCD phenomenology conference) I pulled all the source for the 
handouts together into this book. That being said, this book requires three disclaimers. 

The first disclaimer is that much of the material should be understandable by a first 
year physics student, but some of it can be very advanced, and perhaps not quite so 
appropriate. Hopefully I have managed to point these areas out in the text, so that first 
years reading this book don't panic. Of course, it is also possible that I have written 
incredibly difficult to understand explanations, in which case readers of this book should 
feel free to express their opinions to me. Of course I might not listen to those opinions, 
but I would like to try to make this book better, and the only way to do that is through 
revision. I do believe that all the material presented in this book should be accessible 
to intrepid first year physics students. 

The second disclaimer is that this book might be just as helpful for demonstrators as it 
is to undergraduates. As I was demonstrating I oftentimes wished that concise theoretical 
refreshers for some of the topics were available, instead of having to dig through a variety 
of text books buried under the dust of neglect. Just because demonstrators are typically 
postgraduates does not mean that we remember every last detail about the wavefunctions 
of a hydrogen atom, or BCS theory. Also, just because we are postgraduates doesn't 
mean that we necessarily have a good way to explain certain physics. Hopefully this 
book will serve both of these needs by providing a nice overview of the topic, and also 
present a possible method of teaching the material. 

The third, and hopefully final disclaimer, is that this book contains mistakes. Despite 
my best effort, I am certain this book still contains spelling mistakes, grammar mistakes, 
and worst of all, physics mistakes. That means that while reading this book, always 
double check my work. If something looks wrong, it could very well be wrong. If 
something looks right, it still might be wrong. If you find a mistake, please let me know, 
and I will do my best to fix it. The idea of this book is that it is a growing effort of the 
community to provide a useful resource to UCD students. To that end, the source to 
this book (written in LaTeX with figures made using Inkscape and Octave), is available 
either through the web, or by contacting me. 

Enjoy and don't panic. 



- Philip Ilten 
philten@lhcb.ucd.ie 
1 Uncertainty 



Uncertainty estimation and propagation is sometimes more of an art than an exact sci- 
ence, but nevertheless is critical to the scientific process. Without uncertainty estimation 
we can never know how well our theory matches experimental reality. Quite a few books 
have been written over the years on uncertainty analysis, but one of the best is An In- 
troduction to Error Analysis by John R. Taylor. This book should be on the book shelf 
of every physicist, whether an experimentalist or a theorist. 

This chapter focuses on two main areas, the different types of uncertainty and their 
estimation, and the propagation of uncertainty. Before we can delve into either of these 
areas, we first need to define uncertainty. Whenever we make a measurement, the 
circumstances surrounding that measurement influence the value. For example, if we 
measure the value for gravity on earth, g, we will obtain a different value in Dublin than 
we would in Chicago or Paris. To indicate this, we must write our central value for g 
followed by a range in which g might fall. We call this our uncertainty. 

$$ g = \underbrace{9.81}_{\text{central value (accuracy)}} \pm \underbrace{0.21}_{\text{uncertainty (precision)}} \ \underbrace{\mathrm{m/s^2}}_{\text{units}} \tag{1.1} $$

How close our central value is to the actual value is the accuracy of the measurement, 
while the amount of uncertainty describes the precision of the measurement. If a measurement 
has very good accuracy, but very low precision, it is not very useful. Conversely, 
if a measurement has a very poor accuracy, but very high precision, the measurement 
is still not useful. The art of uncertainty is balancing accuracy and precision to provide 
meaningful measurements that help confirm or deny theories. 

There is one final issue that needs to be discussed regarding the format of writing 
uncertainty, and that is significant figures. The number of significant figures on a 
measurement is the number of meaningful digits. An uncertainty on a final measurement 
should never have more significant figures than the central value, and should in general 
have only one or two digits. The number of significant figures on the central value 
must always reach the same precision as the uncertainty. This allows the reader of the 
experimental data to quickly see both the accuracy and the precision of the results. 
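
The significant figure rules above can be sketched in a few lines of Python. This is only an illustration: the function name `format_measurement` and the sample numbers are made up for this example, and the sketch assumes a positive uncertainty.

```python
import math

def format_measurement(value, uncertainty, sig_figs=1):
    """Round the uncertainty to sig_figs digits and the value to the same decimal place."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Power of ten of the last digit we keep in the uncertainty.
    exponent = math.floor(math.log10(uncertainty)) - (sig_figs - 1)
    decimals = max(-exponent, 0)
    rounded_unc = round(uncertainty, -exponent)
    rounded_val = round(value, -exponent)
    return f"{rounded_val:.{decimals}f} ± {rounded_unc:.{decimals}f}"

# Hypothetical raw numbers from a calculator, quoted with two digits of uncertainty.
print(format_measurement(9.8137, 0.2134, sig_figs=2))  # prints 9.81 ± 0.21
```

The central value is always cut off at the same decimal place as the uncertainty, which is the rule stated above.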

1.1 Types of Uncertainty 

It can be difficult to classify uncertainty; there are many sources, and oftentimes the 
cause of uncertainty is unknown. However, we can broadly classify two types: systematic 
uncertainty and random uncertainty. Systematic uncertainties are types of random 
uncertainty, but are caused by calibration within the experiment. First we will 
discuss the types of random uncertainty, and then use these to understand systematic 
uncertainty. 

Random uncertainty can be caused by a variety of sources, but these can be classified 
in three general areas. The first is apparatus uncertainty. This uncertainty arises 
when the design of the experiment limits the precision of the measurements being made. 
For example, consider trying to measure the pressure of a basketball, using a pressure 
gauge. On cold days the pressure gauge might leak because the gaskets have contracted, 
while on warm days the pressure gauge might not leak at all. Taking measurements from 
different days will yield a range of results caused by this uncertainty. In this example 
the measurements from the warm days are more reliable, as there is no leaking, but in 
some experimental situations this is not so readily apparent. 

The next type of random uncertainty is inherent uncertainty. Some measured 
quantities just are different each time they are measured. This oftentimes is found in 
the biological sciences, especially in population analysis. When measuring the average 
weight of all deer in Ireland, we don't expect to find each deer has the same weight, but 
rather that the weights are spread over a range of values. This means if we wish to quote 
the average weight of a deer in Ireland, we will need to associate an uncertainty with it. 
Inherent uncertainty is also found in physics, in particle physics and 
quantum mechanics. Here, the value of the measurement isn't decided until the observer 
actually makes the measurement. 

The final and most common type of random uncertainty is instrumental uncer- 
tainty. Here the instruments being used to take the measurement have a limited pre- 
cision, and we can only quote the value to the precision of the instrument. In many of 
the labs done in this course, the primary source of uncertainty will be from instrumen- 
tation. An example of instrumental uncertainty is reading the voltage from a circuit 
using a multimeter. The multimeter can only read to 0.1 V, and so our uncertainty in 
the voltage must be ±0.1 V. Similarly, if we are measuring a length with a ruler, the 
instrumentation of the ruler limits our precision to usually a millimeter, and so we have 
an uncertainty of ±1 mm. 

Sometimes all three types of random uncertainty can be combined. If we return to the 
voltage example, consider what would happen if the reading on the multimeter fluctuated 
between 2 and 3 V. Now our apparatus uncertainty (caused by fluctuations in the power 
supply, or for some other reason) is larger than our instrumental uncertainty, and so now 
our uncertainty is ±0.5 V instead of ±0.1 V. The important point to remember about 
random uncertainty is that we never know in which direction the true value lies. In the 
voltage example we could measure a value of 2.5 V, but not be certain if that voltage 
was actually 2 V, 3 V, or any values in between. 

The second broad category of uncertainty, systematic uncertainty, is caused by a 
calibration error in the experiment. Let us return to the basketball example, where 
we are trying to measure the pressure of a basketball. When we read the pressure gauge 
while it is not attached to the basketball we obtain a value of 0 bar. This is because 
the gauge is calibrated incorrectly; we need to account for atmospheric pressure which is 
around 1 bar. If we want to correct all of our pressure measurements from different days, 
we must add something around 1 bar to all the measurements. The only problem is that 
the atmospheric pressure changes from day to day, and so we have an associated random 
inherent uncertainty on 1 bar of around ±0.1 bar. We call this random uncertainty, 
from a calibration adjustment of the data, the systematic uncertainty. When we quote 
systematic uncertainty we add another ± symbol after the random uncertainty. Let us 
say that we have measured the basketball pressure to 1.3 ±0.2 bar without our calibration 
adjustment. Now, when we add on the atmospheric pressure we quote the measurement 
as 2.3 ±0.2 ±0.1 bar. 

1.2 Propagating Uncertainty 

While we now know how to estimate uncertainty on individual experimental measure- 
ments, we still do not know how to propagate the uncertainty. If we have measured the 
length, L, and width, W, of a rectangle of paper, and we have an associated uncertainty 
on each measurement, what is the uncertainty on the area, A, of the paper? Finding 
this from our uncertainties on L and W is called propagation of uncertainty. 

In Figure 1.1 we graphically show how we calculate the uncertainty. In this example 
we have measured L = 5.0 ±0.5 cm, and W = 2.0 ± 1.0 cm (perhaps we used a very poor 
ruler for measuring the width). As a matter of notation we often represent uncertainty 
with the Greek letter $\delta$ (a lower case delta), or sometimes $\sigma$ (the Greek letter sigma). 1 
We calculate our central value for A by, 

$$ A = L \times W \tag{1.2} $$

which is the formula for the area of a rectangle, shown in Figure 1.1(a). At first glance, 
propagating our uncertainty to A might seem simple; we can just multiply the uncer- 
tainties, just as we multiplied the central values to obtain A. 

Figure 1.1(b) shows why this does not work. The red rectangle is still the central value 
for A, 10.0 cm$^2$. The dashed purple rectangle is the smallest possible value for A we can 
obtain within the range of uncertainty for L and W, while the dashed blue rectangle is 
the largest possible value for A we can obtain. If we just multiply $\delta_L$ and $\delta_W$ we obtain 
the small dashed rectangle in red. This uncertainty is clearly much too small. 

The possible values for the area of the rectangle range from the purple rectangle to 
the blue rectangle. These are the extrema (maximum and minimum) of the area, and 
by taking their difference and dividing by two, we can find the average range about the 
central value for A. 



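$$ \delta_A = \frac{A_{\max} - A_{\min}}{2} = \frac{(L + \delta_L)(W + \delta_W) - (L - \delta_L)(W - \delta_W)}{2} \tag{1.3} $$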



1 We use $\sigma$ when referring to uncertainty taken from a large number of measurements as it refers to the 
standard deviation, which is explained later in this chapter. 
Figure 1.1: Geometric representations for the different methods of propagating 
uncertainty for the area of a rectangle. 

Plugging in our values for L, W, $\delta_L$, and $\delta_W$, we arrive at $\delta_A$ = (16.5 - 4.5)/2 = 6.0 cm$^2$, a much 
larger uncertainty than the incorrect $\delta_L \times \delta_W$ = 0.50 cm$^2$. This gives us a value for A of 
10.0 ± 6.0 cm$^2$; the actual area for the rectangle could range anywhere from around 4 to 
16 cm$^2$. Notice how the upper and lower bounds match nicely with the blue and purple 
rectangles respectively. 

The uncertainty propagation method outlined above can be applied to any formula and 
is called the extremum uncertainty method. This is because to find the uncertainty 
on the calculated quantities, we find the largest value for the quantity possible, and the 
smallest value possible, take the difference, and divide by two. The tricky part of this 
method is finding the maximum and the minimum values for the calculated quantity. 
In the example above it is easy to visualize, as we see the rectangle is the largest when 
we add uncertainty onto L and W, and smallest when we subtract the uncertainty. 
But what happens when we look at a more complicated function? Let us consider the 
following function f(x, y, z) which is dependent upon the measured quantities x, y, and 
z (analogous to L and W in the example above). 

$$ f = x^2 - 2y + \frac{1}{z} \tag{1.4} $$

Now we need to see what happens to f when we change either x, y, or z. For x we 
see that f is maximized whenever x is as large as possible in magnitude (whether positive or negative) 
and minimized for x near zero. For y, we see f is maximized for large negative y and 
minimized for large positive y. For z, the behavior of f is even trickier. Large positive 
and negative z make 1/z very close to zero. Very small positive values of z make 1/z a 
very large positive number, and very small negative values of z make 1/z a very large 
negative number. From this behavior we can see that to maximize f, we want z to be 
as close to zero as possible while still being positive. To minimize f, we want z to be as 
close to zero as possible while being negative. 
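
When hunting for the maximum and minimum by hand becomes tedious, a computer can simply try many combinations. The following is a rough sketch, not a definitive implementation: the function name `extremum_uncertainty` and the grid size are choices made for this example, and it assumes the function stays finite over every range scanned (so a range for z that crosses zero in Equation 1.4 would not be allowed).

```python
from itertools import product

def extremum_uncertainty(f, values, uncertainties, steps=20):
    """Half the spread of f over a grid covering each value ± its uncertainty."""
    grids = []
    for v, d in zip(values, uncertainties):
        # steps + 1 evenly spaced points from v - d up to v + d, end points included.
        grids.append([v - d + 2 * d * k / steps for k in range(steps + 1)])
    results = [f(*point) for point in product(*grids)]
    return (max(results) - min(results)) / 2

def area(L, W):
    return L * W

# Rectangle example: L = 5.0 ± 0.5 cm and W = 2.0 ± 1.0 cm.
delta_A = extremum_uncertainty(area, [5.0, 2.0], [0.5, 1.0])
print(delta_A)
```

The same call works for any function of the measured quantities, at the cost of evaluating it at every grid point, which grows quickly with the number of variables.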

From the example above it is readily apparent that the extremum method for propagat- 
ing uncertainty can quickly become very complicated, and also a little tedious. Luckily, 
in some cases we can bypass the extremum method and propagate the uncertainty using 
relative uncertainty. Relative uncertainty is the uncertainty on a quantity, $\delta_x$, divided 
by that measurement x, i.e. $\delta_x/x$. If we return to the rectangle example of Figure 1.1, 
we see that the relative uncertainty on W is 0.50 (or 50%), and the relative uncertainty 
on L is 0.10 (or 10%). 

If one relative uncertainty for a measurement is much larger than the relative uncer- 
tainties for the other measurements, we can just focus on the largest relative uncertainty, 
and assume that our calculated quantity will have approximately the same relative un- 
certainty. In this example the relative uncertainty on W is much larger than the relative 
uncertainty on L, and so we assume that A will have a relative uncertainty of approxi- 
mately 0.50. 

Looking at Figure 1.1(c) we can see this method in action. Now we have A = 10.0 ± 5.0 
cm$^2$, which is close to the result of the extremum method, 
10.0 ± 6.0 cm$^2$. The uncertainty from the extremum method is larger of course, as with 
the relative method we are ignoring our uncertainty on L. We can also apply this to any 
general function f dependent upon multiple measurements, but with the main source of 
relative uncertainty from the variable x. 

Notice that this method is much faster (and simpler) than the extremum method, but is 
only valid when the relative uncertainty on x is much larger than the relative uncertainty 
on the other variables. 
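
In symbols, the relative uncertainty method can be sketched as below. The notation $\delta_f$ for the uncertainty on f is assumed here to match the rest of the chapter, and the approximation is most trustworthy when f is a simple product or quotient of the measurements, as in the rectangle example.

```latex
\frac{\delta_f}{|f|} \approx \frac{\delta_x}{|x|}
\quad \Rightarrow \quad
\delta_f \approx |f| \, \frac{\delta_x}{|x|}

% Rectangle example, with W as the dominant source of relative uncertainty:
\delta_A \approx A \, \frac{\delta_W}{W}
= 10.0~\mathrm{cm}^2 \times \frac{1.0~\mathrm{cm}}{2.0~\mathrm{cm}}
= 5.0~\mathrm{cm}^2
```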

There is one final method for propagating uncertainty, the normal uncertainty 
method, which is the most common method used in physics. However, understanding 
this method can be a bit challenging, and understanding when and when not to use 
this method is not trivial. The remainder of this chapter is devoted to attempting to 
explain the motivation behind this method, but here, a brief example will be given just 
to demonstrate the method. 

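If we have a function $f(x_1, x_2, \ldots, x_n)$ dependent upon n independent measurements, 
$x_1$ up to $x_n$ (or more generally, measurements $x_i$), we can propagate the uncertainty as 
follows. 

$$ \sigma_f^2 = \left(\frac{\partial f}{\partial x_1}\right)^2 \sigma_1^2 + \left(\frac{\partial f}{\partial x_2}\right)^2 \sigma_2^2 + \cdots + \left(\frac{\partial f}{\partial x_n}\right)^2 \sigma_n^2 \tag{1.7} $$
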
Here we have replaced our traditional $\delta$ letter for uncertainty with the letter $\sigma$, where $\sigma_i$ 
corresponds to the uncertainty associated with measurement $x_i$. The symbol $\partial$ denotes 
a partial derivative; this is where we keep all other variables in f constant, and just 
perform the derivative with respect to $x_i$. 

We can now use this method of propagation on the rectangle example of Figure 1.1. 
First we must apply Equation 1.7 to Equation 1.2. 

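$$ \sigma_A^2 = \left(\frac{\partial A(L,W)}{\partial W}\right)^2 \sigma_W^2 + \left(\frac{\partial A(L,W)}{\partial L}\right)^2 \sigma_L^2 = L^2 \sigma_W^2 + W^2 \sigma_L^2 \tag{1.8} $$

Plugging in the uncertainties we find $\sigma_A = \sqrt{26} \approx 5.1$ cm$^2$, giving A = 10.0 ± 5.1 cm$^2$, nearly the same 
uncertainty as the 5.0 cm$^2$ we arrived at using the relative uncertainty method! 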
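
For functions where taking the partial derivatives by hand is awkward, the same propagation can be sketched numerically. This is an illustration under stated assumptions: the name `propagate_normal` is invented for this example, the measurements are assumed independent, and each partial derivative is approximated by a central difference rather than computed exactly.

```python
import math

def propagate_normal(f, values, sigmas, rel_step=1e-6):
    """Estimate sigma_f for independent variables from numerical partial derivatives."""
    variance = 0.0
    for i, (x, s) in enumerate(zip(values, sigmas)):
        h = rel_step * max(abs(x), 1.0)  # small step in variable i only
        up = list(values)
        up[i] = x + h
        down = list(values)
        down[i] = x - h
        partial = (f(*up) - f(*down)) / (2 * h)  # approximates df/dx_i
        variance += (partial * s) ** 2           # add the terms in quadrature
    return math.sqrt(variance)

# Rectangle example: A = L * W with L = 5.0 ± 0.5 cm and W = 2.0 ± 1.0 cm.
sigma_A = propagate_normal(lambda L, W: L * W, [5.0, 2.0], [0.5, 1.0])
print(sigma_A)
```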

1.3 Probability Density Functions 

In the previous sections we have seen what uncertainty is, and how it is possible to prop- 
agate it using two different methods, the extremum method, and the relative uncertainty 
method. However, the two methods described in detail above are very qualitative and 
do not work well when a single value is repeatedly measured. As an example, let us 
consider a race car driving around a track, and trying to measure the velocity each time 
it passes. The race car driver is trying to keep the velocity of the car as constant as 
possible, but of course this is very difficult. Therefore, we expect that the velocities we 
record for each lap will be similar, but not exactly the same. If we make a histogram 2 
of these values and divide the histogram by the number of measurements we have made, 
we will have created a probability density function for the velocity of the race car. 

This histogram describes the probability of measuring the velocity of the car to be 
within a certain range of velocities. For example we can find the area under the entire 
histogram, which should return a value of one. This tells us that we always expect 
to measure a velocity within the ranges we have previously measured. All probability 
density functions when integrated from negative infinity to infinity should yield an area 
of one; when the area under any curve is one we call the curve normalized. Similarly, 
if we want to find the probability of measuring a certain velocity instead of a range of 
velocities, we see the probability is zero. This is because we do not expect to be able to 
measure an arbitrarily precise value for the velocity of the car. 

Probability density functions can be described by a variety of properties, but two of 
the most important are the mean and the variance. The mean for a probability density 
function is exactly the same as the mean average taught in grade school, and is often 
denoted by the Greek letter $\mu$ (spelled mu). 

Before we can understand exactly what the variance of a probability density function 
is, we need to introduce the expectation value of a probability density function. If we 
have a probability density function PDF(x) dependent upon the variable x we define 
the expectation value for a function of x, f(x), as 

$$ E[f(x)] = \int_{-\infty}^{\infty} f(x) \, \mathrm{PDF}(x) \, dx \tag{1.9} $$

where we just integrate the instantaneous probability of x, PDF(x), times the function of 
x for which we are trying to find the expectation value, f(x). The mean of a probability 
density function is just the expectation value of the function f(x) = x, E[x]. In other 
words, we expect that if we measured the variable x many times, we would find an 
average value of E[x]. 

The variance of a probability density function dependent on the variable x is, 

$$ \sigma^2(x) = E\left[(x - \mu)^2\right] = E\left[x^2\right] - \mu^2 \tag{1.10} $$

or the expectation value of $(x - \mu)^2$, and is denoted by the symbol $\sigma^2$ or sometimes the 
letters Var. The standard deviation of a probability density function is just the square 
root of the variance. 

$$ \sigma(x) = \sqrt{\sigma^2(x)} \tag{1.11} $$

The standard deviation, given by the Greek letter $\sigma$ (spelled sigma), measures how 
far most measured values for a variable x deviate from $\mu$, the mean of x. When the 
uncertainty for a measurement with a known probability density function is quoted, the 
uncertainty is usually just one standard deviation as calculated above. 
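
For a finite list of measurements the integrals above become simple averages. The sketch below uses made-up lap velocities for the race car example (the numbers and function names are purely illustrative) and checks that the two forms of the variance in Equation 1.10 agree.

```python
import math

def mean(samples):
    """Discrete version of E[x]: the ordinary average."""
    return sum(samples) / len(samples)

def variance(samples):
    """Discrete version of E[(x - mu)^2]."""
    mu = mean(samples)
    return sum((x - mu) ** 2 for x in samples) / len(samples)

def variance_shortcut(samples):
    """Discrete version of E[x^2] - mu^2, the second form in Equation 1.10."""
    return mean([x * x for x in samples]) - mean(samples) ** 2

# Hypothetical lap velocities in km/h, invented for this example.
lap_velocities = [201.3, 198.7, 200.4, 199.1, 202.0, 199.8, 200.6, 198.9]

mu = mean(lap_velocities)
sigma = math.sqrt(variance(lap_velocities))
print(f"mean = {mu:.2f} km/h, standard deviation = {sigma:.2f} km/h")
```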



2 A histogram is just a bar graph that plots the number of events per value range of that variable. 
Figure 1.2: The normal probability density function of Equation 1.12 is given with one 
standard deviation in light blue, two standard deviations in dark blue, three 
standard deviations in purple, and four standard deviations in red. 



1.4 Normal Uncertainty 

In physics the probability density functions of most measurements are described by the 
normal distribution. 3 The formula for the normal distribution is, 



$$ \mathrm{PDF}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) \tag{1.12} $$



and is plotted in Figure 1.2. Here $\sigma$ is the standard deviation of the curve, and $\mu$ the 
mean as defined in the previous section. For a normal distribution, 68% of all measured 
values of x are expected to fall within $\sigma$ of $\mu$, while 95% are expected to fall within $2\sigma$. 
When a physics measurement is stated with an uncertainty, the uncertainty is assumed 
to represent one standard deviation of the data, unless indicated otherwise. Because 
most measurements in physics are described by the normal distribution, roughly 68% of the 
values measured by the experimenter are expected to fall within this uncertainty range. 
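
The percentages quoted above can be checked with the error function from Python's standard library. This is a small sketch; the function name `fraction_within` is made up for this example.

```python
import math

def fraction_within(k):
    """Probability that a normally distributed value lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3, 4):
    print(f"within {k} sigma: {100 * fraction_within(k):.2f}%")
```

The first two lines of output give 68.27% and 95.45%, which are the 68% and 95% quoted in the text.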



3 This distribution also goes under the names of Gaussian distribution, normal curve, or bell curve. 

But this leads to the question, why are most physics measurements described by a 
normal distribution? The reason for this is what is known as the central limit theorem. 
The central limit theorem can be interpreted many ways, but dictates that under the 
correct initial conditions, the sum of many independent samples drawn from most 
probability density functions converges to the normal distribution. 

The above statement of the central limit theorem is very general and not the most 
intuitive to understand, so it may be more helpful to illustrate a consequence of the 
central limit theorem. Let us consider a box which we fill with different colored marbles. 
First we place $N_r$ red marbles into the box, where $N_r$ is a random number chosen from 
a uniform distribution between 0 and 10. In a uniform distribution each number is 
equally likely to be picked, and so the distribution is just a rectangle from 0 to 10 with 
area 1. We have chosen this distribution as it is clearly not a normal distribution. 

After placing the red marbles into the box, we count the number of marbles in the 
box, and record this number for this first trial as $N_r$. Next we add $N_o$ orange marbles, 
where again we determine the random number $N_o$ from a uniform distribution between 
0 and 10. The number of marbles in the box will now just be $N_r + N_o$. We record this 
value for our first trial as well. Next we add a random number of yellow marbles, $N_y$, 
between 0 and 10 and again record the total number of marbles in the box. We continue 
this experiment by adding green, blue, and purple marbles in the exact same fashion to 
the box, recording the total number of marbles in the box after each new color is added. 

We now perform this experiment thousands of times and tabulate the number of 
marbles after each step of adding a new color. We take all these numbers and make a 
histogram for each step. In Figure 1.3 we have simulated the experiment 2000 times. 
The histogram for the first step is a uniform distribution, as we expect. This distribution 
just tells us the probability of finding $N_r$ red marbles in the box after the first step, which 
is 1/11 for each value from 0 to 10. 

In the histogram for the second step, something unexpected happens. Here we have 
added the uniform distribution for the red marbles with the uniform distribution for the 
orange marbles, and have obtained a non-uniform distribution for the total number of 
marbles in the box! This is a direct consequence of the central limit theorem. As we 
add on more uniform distributions from the yellow, green, blue, and purple marbles, the 
distribution for the total number of marbles in the box after each step looks more and 
more like a normal curve. The result is striking. What began as a flat distribution now 
resembles a normal distribution; the total number of marbles in the box, after adding 
all the colors, is normally distributed! 

There are a few important features to notice about the steps performed in Figure 1.3. 
The first is that the mean of the distribution, $\mu$, is just the addition of the means of 
the component distributions. For example, we know that on average we will pick 5 red 
marbles, 5 orange marbles, etc. Subsequently in the first histogram the mean for the 
histogram is $\mu_r$, while in the second histogram the mean is $\mu_r + \mu_o$. The second important 
point to notice is that for each histogram the variances add just like the means. The 
variance for a discrete uniform distribution is just, 

$$ \sigma^2 = \frac{N^2 - 1}{12} \tag{1.13} $$
Figure 1.3: An illustration of the central limit theorem. The black curve is the normal 
curve the distribution is approaching. 
where N is the number of discrete values available. The distribution in the first histogram 
of Figure 1.3 is sampled between 0 and 10, and so with N = 11 the variance is 10. 
Subsequently the following distributions have variances of 20, 30, 40, 50, and 60. In 
Figure 1.3 the values given for $\mu$ and $\sigma^2$ do not match exactly what is written above. 
This is because the histograms were made by simulating the experiment outlined above 
2000 times. 4 This is similar to when we flip a coin 10 times; we don't expect exactly 5 
heads and 5 tails, but instead numbers near 5. 
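
The marble experiment is easy to simulate. The sketch below is one possible way to do it and is not the code used to make Figure 1.3; the function names and the fixed random seed are choices made for this example.

```python
import random

def simulate_marbles(trials=2000, colors=6, max_marbles=10, seed=1):
    """Return, for each step, the list of running marble totals over all trials."""
    rng = random.Random(seed)
    totals_per_step = [[] for _ in range(colors)]
    for _ in range(trials):
        total = 0
        for step in range(colors):
            total += rng.randint(0, max_marbles)  # uniform over 0..10, both ends included
            totals_per_step[step].append(total)
    return totals_per_step

def mean_and_variance(values):
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, var

for step, totals in enumerate(simulate_marbles(), start=1):
    mu, var = mean_and_variance(totals)
    print(f"after {step} colors: mean = {mu:.2f} (expect {5 * step}), "
          f"variance = {var:.2f} (expect {10 * step})")
```

As in the figure, the simulated means and variances land near the expected values of 5 and 10 per color without matching them exactly.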

From the example above, we can see the power of the central limit theorem. If we 
think of taking measurements in physics as adding together many different probability 
density functions (like in the marble example), we see that the end result is a normal 
distribution. Whether this approximation is valid or not depends upon the situation, 
but in general, normal distributions model physics measurements well. 

1.5 Normal Uncertainty Propagation 

Let us assume that we have experimentally measured n different variables, $x_i$, in an 
experiment. Also, let us assume that for each measured variable we have taken a large 
number of data points, N, and verified that the data points for each variable are normally 
distributed with a variance of $\sigma_i^2$. Now we wish to calculate the standard deviation, $\sigma_f$, 
for a function dependent upon the measured variables, $f(x_1, x_2, \ldots, x_n)$. 

We can't just add the standard deviation of the variables together, as we have seen in 
the previous sections. Instead we must use normal uncertainty propagation, which was 
briefly demonstrated earlier. We will begin by looking at the full propagation method 
and then trying to understand it, first through an intuitive argument and then through a 
more rigorous proof. The full formula for propagating normal uncertainty is given below 
in all of its glory. 
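$$ \sigma_f^2 = \begin{pmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} & \cdots & \frac{\partial f}{\partial x_n} \end{pmatrix} \begin{pmatrix} \sigma_1^2 & \rho_{1,2}\sigma_1\sigma_2 & \cdots & \rho_{1,n}\sigma_1\sigma_n \\ \rho_{1,2}\sigma_1\sigma_2 & \sigma_2^2 & \cdots & \rho_{2,n}\sigma_2\sigma_n \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{1,n}\sigma_1\sigma_n & \rho_{2,n}\sigma_2\sigma_n & \cdots & \sigma_n^2 \end{pmatrix} \begin{pmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{pmatrix} \tag{1.14} $$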



4 This is what we call a Monte Carlo experiment. We use a random number generator with a computer 
to simulate the experiment. 

Needless to say the above method for propagation looks very nasty, and it is. The 
matrix in the middle of the equation, consisting of $\sigma$'s and $\rho$'s, is called the variance-covariance 
matrix and essentially relates the correlations of all the variables $x_i$. The 
coefficient $\rho_{i,j}$ is defined as the correlation coefficient and represents how strongly the 
variables $x_i$ and $x_j$ are correlated. For variables that are completely correlated (i.e. they 
are the same variable) $\rho_{i,j}$ is just one. For the case that the variables are completely 
uncorrelated $\rho_{i,j}$ is just zero. The general form of $\rho_{i,j}$ is, 

$$ \rho_{i,j} = \frac{E\left[(x_i - \mu_i)(x_j - \mu_j)\right]}{\sigma_i \sigma_j} \tag{1.15} $$

where E is the expectation value explained above, and $\mu_i$ is the mean value of the 
variable $x_i$. 
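
For a finite set of paired measurements, the expectation value in Equation 1.15 becomes an average over the data. A minimal sketch follows; the function name `correlation` and the sample lists are invented for this example.

```python
import math

def correlation(xs, ys):
    """Discrete version of Equation 1.15 for paired samples xs[k], ys[k]."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    covariance = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    sigma_x = math.sqrt(sum((x - mu_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mu_y) ** 2 for y in ys) / n)
    return covariance / (sigma_x * sigma_y)

xs = [1.0, 2.0, 3.0, 4.0]
print(correlation(xs, xs))                           # a variable against itself
print(correlation([-1.0, 0.0, 1.0], [1.0, 0.0, 1.0]))  # an uncorrelated pair
```

A variable compared with itself gives a coefficient of one, while the second pair of lists is built so that the products above and below the mean cancel.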

Luckily for us, the majority of experiments in physics consist of measuring independent 
variables, variables where $\rho_{i,j} = 0$, and so Equation 1.14 is greatly simplified. 



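$$ \sigma_f^2 = \left(\frac{\partial f}{\partial x_1}\right)^2 \sigma_1^2 + \left(\frac{\partial f}{\partial x_2}\right)^2 \sigma_2^2 + \cdots + \left(\frac{\partial f}{\partial x_n}\right)^2 \sigma_n^2 = \sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i}\right)^2 \sigma_i^2 \tag{1.16} $$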



This unfortunately is not a very intuitive equation, and so a bit more explanation is 
necessary. 

Let us first consider a simple example where we have one variable $x_1$ and a function, 
$f(x_1)$, dependent upon only $x_1$. We can make a plot of $x_1$ on the x-axis and $f(x_1)$ on 
the y-axis. Let us choose a specific $x_1$ and label this point p. All points of $x_1$ have an 
associated uncertainty of $\sigma_1$ and so we know the uncertainty for p is $\sigma_1$. Looking at 
Figure 1.4(a) we see that our uncertainty $\sigma_1$ is in the x-direction, and that we want $\sigma_f$, 
which should be in the y-direction. 

To find $\sigma_f$ we can just use simple geometry. It seems reasonable that at point p the 
change in $f(x_1)$ over the change in $x_1$ should equal the change in the uncertainty on 
$f(x_1)$ over the change in uncertainty on $x_1$. 

$$ \frac{\Delta f(x_1)}{\Delta x_1} \to \frac{d f(x_1)}{d x_1} = \frac{\sigma_f}{\sigma_1} \quad \Rightarrow \quad \sigma_f = \frac{d f(x_1)}{d x_1} \, \sigma_1 \tag{1.17} $$

In the first step we have taken the limit of $\Delta f(x_1)/\Delta x_1$ as $\Delta x_1$ grows very small, which 
just gives us the derivative, or the slope exactly at point p rather than in the general 
vicinity of p. In the next step we set this equal to the change in uncertainty and in the 
final step we just solve for $\sigma_f$. 
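
As a quick worked example of the single variable case (the function and the numbers here are hypothetical, chosen only to illustrate the steps), take $f(x_1) = x_1^2$ with a measurement of $x_1 = 3.0 \pm 0.1$.

```latex
\frac{d f(x_1)}{d x_1} = 2 x_1 = 6.0
\quad \Rightarrow \quad
\sigma_f = \frac{d f(x_1)}{d x_1} \, \sigma_1 = 6.0 \times 0.1 = 0.6

f(x_1) = 9.0 \pm 0.6
```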

Now we have found how to propagate the uncertainty for just one variable $x_1$, but 
we need to be able to do this for n variables $x_i$. The first change we need to make 
is substitute the derivative of Equation 1.17 with a partial derivative. 5 The second 
change we need to make is how we think of our uncertainty. The uncertainties don't just 
add linearly like numbers, but rather are components of an uncertainty vector. We don't 
care about the direction of the uncertainty vector but we do care about the magnitude of 
the vector, as this gives us our uncertainty on $f(x_i)$. To find the magnitude of a vector, 
we just take the square root of the sum of all the components squared. This process is 
called adding in quadrature. 

$$ \sigma_f = \sqrt{\left(\frac{\partial f}{\partial x_1}\right)^2 \sigma_1^2 + \left(\frac{\partial f}{\partial x_2}\right)^2 \sigma_2^2 + \cdots + \left(\frac{\partial f}{\partial x_n}\right)^2 \sigma_n^2} \tag{1.18} $$

5 For those not familiar with a partial derivative, we denote it with the symbol $\partial$. To take a partial 
derivative such as $\partial f(x,y,z)/\partial x$ just differentiate f(x,y,z) with respect to x and think of y and z 
as constants. 

Figure 1.4: Figure 1.4(a) geometrically illustrates how the uncertainty $\sigma_f$ is propagated 
from an uncertainty $\sigma_1$. Figure 1.4(b) demonstrates how the components of 
uncertainty must be added in quadrature. 



For those readers not familiar with taking the magnitude of a vector, think of the 
Pythagorean theorem where we find the length of the hypotenuse of a triangle, c, from 
the sides of the triangle, a and b, by the formula $c = \sqrt{a^2 + b^2}$. In Figure 1.4(b) we are 
now considering an example where n = 2 and f is dependent on two variables, $x_1$ and 
$x_2$, with uncertainties of $\sigma_1$ and $\sigma_2$. Using Equation 1.17 we replace sides a and b with 
the $x_1$ and $x_2$ components of $\sigma_f$, and the hypotenuse, c, with $\sigma_f$. Using Pythagoras' 
theorem we arrive back at Equation 1.18. 

Unfortunately, Equation 1.18 only matches Equation 1.14 when $\rho_{ij} = 0$, or when the 
variables $x_i$ are independent of each other. To understand how we can introduce the 
$\rho_{ij}$ terms we must leave the geometric derivation for uncertainty propagation outlined 
above and turn to a more mathematically rigorous derivation. The math here can get a 
little complicated, but is given for the curious. 

A Taylor series is an expansion of a function $f$ about a certain point $a$ and is given 
by an infinite sum. For the $n$ dimensional case the function $f(x_1, x_2, \ldots, x_n)$ is expanded 
in infinite sums about the points $a_1, a_2, \ldots, a_n$, with each derivative evaluated at the 
expansion point. 

$$T(f(x_1,\ldots,x_n)) = \sum_{m_1=0}^{\infty} \cdots \sum_{m_n=0}^{\infty} \frac{\partial^{m_1+\cdots+m_n} f(x_1,\ldots,x_n)}{\partial x_1^{m_1} \cdots \partial x_n^{m_n}} \, \frac{(x_1-a_1)^{m_1} \cdots (x_n-a_n)^{m_n}}{m_1! \cdots m_n!} \tag{1.19}$$



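Let us assume now that we have made $N$ measurements, and that for the $k$th 
measurement we have the measured values $x_{1,k}$ through $x_{n,k}$. We can now expand our 
function $f(x_1, \ldots, x_n)$ about its average $\mu_f$ for each measurement $k$, and truncate the 
expansion at first order (i.e. we only look at terms with first derivatives). Writing $f_k$ 
for the value of $f$ at measurement $k$ and $\mu_i$ for the average of $x_i$, the Taylor 
series for measurement $k$ to first order is as follows. 

$$f_k - \mu_f \approx \frac{\partial f}{\partial x_1}(x_{1,k} - \mu_1) + \cdots + \frac{\partial f}{\partial x_n}(x_{n,k} - \mu_n) \tag{1.20}$$

Using our definition for variance given in Equation 1.10 in combination with the limit 
of a discrete version of Equation 1.9, we can then write the variance of $f$. 

$$\sigma_f^2 = \lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} (f_k - \mu_f)^2 \tag{1.21}$$

Plugging Equation 1.20 into the above we arrive at, 

$$\sigma_f^2 = \lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} \left[ \left(\frac{\partial f}{\partial x_1}\right)^2 (x_{1,k}-\mu_1)^2 + \cdots + \left(\frac{\partial f}{\partial x_n}\right)^2 (x_{n,k}-\mu_n)^2 + 2\,\frac{\partial f}{\partial x_1}\frac{\partial f}{\partial x_2}(x_{1,k}-\mu_1)(x_{2,k}-\mu_2) + \cdots + 2\,\frac{\partial f}{\partial x_{n-1}}\frac{\partial f}{\partial x_n}(x_{n-1,k}-\mu_{n-1})(x_{n,k}-\mu_n) \right] \tag{1.22}$$

where we have expanded the square. We can break this into the individual sums, and using 
the definition of variance along with a discrete definition of the correlation coefficient 
given in Equation 1.15, recover Equation 1.14! 

$$\sigma_f^2 = \sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i}\right)^2 \sigma_i^2 + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \rho_{ij} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} \sigma_i \sigma_j \tag{1.23}$$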
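
To see what the correlation terms do in practice, here is a small Python sketch of 
uncertainty propagation with correlated variables. It assumes the partial derivatives 
are already known and that the correlation coefficients are supplied as a symmetric 
matrix with ones on the diagonal; the function name and the example numbers are 
hypothetical. 

```python
import math


def propagate_correlated(gradient, sigmas, rho):
    """Uncertainty on f given its partial derivatives, the uncertainties on
    each variable, and a correlation matrix rho (rho[i][i] must be 1).

    The double sum over i and j contains each squared term once and each
    cross term twice, which reproduces the usual factor of 2.
    """
    n = len(gradient)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            variance += gradient[i] * gradient[j] * rho[i][j] * sigmas[i] * sigmas[j]
    return math.sqrt(variance)


# Hypothetical example: f = x1 + x2, so both partial derivatives are 1.
gradient = [1.0, 1.0]
sigmas = [0.3, 0.4]

independent = propagate_correlated(gradient, sigmas, [[1.0, 0.0], [0.0, 1.0]])
correlated = propagate_correlated(gradient, sigmas, [[1.0, 1.0], [1.0, 1.0]])
anticorrelated = propagate_correlated(gradient, sigmas, [[1.0, -1.0], [-1.0, 1.0]])
print(independent, correlated, anticorrelated)
```

With no correlation the uncertainties add in quadrature, with full correlation they add 
linearly, and with full anti-correlation they partially cancel. 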
2 Plots 



The old adage "a picture is worth a thousand words" is true for many things but could 
not be more relevant to the field of physics, although perhaps the saying should be 
changed to "a plot is worth a thousand words". Plots quickly and easily allow readers to 
assimilate data from experiment, compare data to theory, and observe trends; the list goes 
on and on. Because plots are so important in physics, it is critical that plots are made 
correctly. This chapter attempts to cover the basics of plotting, fitting data, and the 
methods behind the fitting of data. 

2.1 Basics 

Figure 2.1 gives an example of a well-made plot. The first point to notice about Figure 2.1 
is the title. Every plot should have a title; this allows the reader to quickly understand 
what the plot is attempting to show, without having to read pages of accompanying 
text. Sometimes if the figure is accompanied by a caption the title is neglected, but 
in general, plots should always have titles. Additionally, the title should be relatively 
short, but also convey meaning. In this example the title is "Velocity of an Electron 
in a Magnetic Field". The meaning here is clear: we expect to find information within 
the plot pertaining to the velocity of an electron as it moves through a magnetic field. 
Notice that titles such as "Graph 1" are not helpful. Such a title tells us absolutely 
nothing about the content of the plot. 

The next items to notice about the plot in Figure 2.1 are the labels for the x-axis and 
y-axis. It is important that the labels are clear about what is being plotted along each 
axis. In this plot the product of two quantities $\beta$ and $\gamma$, both calculated from the velocity 
of an electron, is plotted on the y-axis while the intensity of a surrounding magnetic field 
is plotted on the x-axis. The labels clearly and concisely summarize this information 
without being ambiguous. Each label is followed by square brackets filled with the units 
of the axis. Notice that the quantity $\beta\gamma$ is unitless and so the label is followed with the 
word "unitless" in square brackets. While this is not necessary, it informs the reader that the 
plotted quantity has no units, and that a unit label was not just forgotten. Without units it 
is nearly impossible to guess what information the plot is trying to depict. 

In the example of Figure 2.1 an electron is passing through a magnetic field and its 
velocity is being measured using a velocity selector.¹ From the plot it is clear that the 
magnetic field is being adjusted, and for each adjustment in the field the velocity of the 
electron is measured and the value $\beta\gamma$ calculated. We know this because traditionally 
the variable quantity is plotted along the x-axis while the measured quantity is plotted 
along the y-axis. Similarly, for three dimensional plots the measured quantity is plotted 
on the z-axis, while the variable quantities are plotted along the x- and y-axes. 

¹ This is an actual experiment using real data. The experiment was performed at MIT's Junior Lab, 
and the full write up is available at http://severian.mit.edu/philten/physics/dynamics.pdf. 

Figure 2.1: An example of a well scaled plot with a title, axis labels, units, a legend, 
theory curves, a curve of best fit (with associated reduced $\chi^2$), and data 
points with error bars. 

Now that we have discussed all the important aspects of labeling we can focus on the 
actual contents of the plot, but it is important to remember that without proper labels the 
content of a plot is meaningless. Looking at the contents of Figure 2.1 we see three lines 
(a blue line, a green line, and a red line) along with black data points. When multiple 
items are plotted, a legend, given here in the left hand corner of the plot, is a necessity. 
A quick glance at the legend tells us that the blue line is a theoretical prediction made 
using classical mechanics, the green line is a theoretical prediction made using relativistic 
mechanics, the red line is a line of best fit, and the black data points are the data from 
the experiment. 

Above and below each data point a line is vertically extended; this is called an error 
bar, which represents the uncertainty on the data point. For example, the first data point 
at $B = 80$ Gauss has a measured value of 0.981 with an uncertainty of 0.073 and so the 
lower error bar extends to a value of 0.908 while the upper error bar extends to a value 
of 1.054. In this experiment the uncertainty was arrived at using normal uncertainty, 
and so we know that 68% of the experimenter's measurements fell within the range given 
by the error bars, or in this case the first measurement can be written as 0.981 ± 0.073.² 
Uncertainty can be displayed along the x-axis, and can also be simultaneously displayed 
along both x and y, although traditionally uncertainty is propagated so that it is only 
displayed along the y-axis. Additionally, sometimes an error band is used instead, 
which is just a continuous version of error bars. 

2.2 Fitting Data 

One important part of Figure 2.1 that was not discussed in the previous section is 
the set of blue, green, and red lines. Lines such as these, especially the red line, can be found 
in most scientific plots as they help the reader understand how well different theories 
match with the experimental data. Before we can fully explain the plot we need a small 
amount of theory. In classical (or Newtonian) mechanics, the mechanics taught in this 
course, the velocity of an object has no upper limit. Einstein, at the beginning of the 
20th century, postulated that this actually is not true, and that very fast moving objects 
(nothing that we will observe in the lab) cannot exceed the speed of light. 

In the creation of this theory Einstein introduced two new quantities that can be 
calculated from the velocity of an object. The first, 

$$\beta = \frac{v}{c} \tag{2.1}$$

is represented by the Greek letter $\beta$ (spelled beta), and is just the velocity of an object, $v$, 
divided by the speed of light in a vacuum, $c$. Notice that according to Einstein, because 
$v \le c$, $\beta$ must always be less than or equal to 1. The second quantity, 

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}} \tag{2.2}$$

is called the Lorentz $\gamma$-factor and is represented by the Greek letter $\gamma$ (spelled gamma). 
Without getting bogged down in details, the data for Figure 2.1 were gathered by firing 
electrons at a very high velocity through a magnetic field and recording their velocity. 
Using both classical and relativistic theory it is possible to predict $\beta$ for the electron 
(the subscript $c$ designates classical theory, and the subscript $r$ relativistic theory). 



$$\beta_c = \frac{\rho B e}{m_e c^2}, \qquad \beta_r = \frac{1}{\sqrt{1 + \left(\frac{m_e c^2}{\rho B e}\right)^2}} \tag{2.3}$$

Here the quantity $\rho$ is a physical constant of the experiment. The quantity $e$ is the 
fundamental charge of an electron, $m_e$ the mass of an electron, and $c$ the speed of 
light. The quantity $B$ is the strength of the magnetic field through which the electron is 
traveling, and is the variable in this experiment which we know already, as $B$ is plotted 
on the x-axis of Figure 2.1. 

² See Chapter 1 for more details on how uncertainty is estimated and propagated. 

We can find $\gamma$ for both classical and relativistic theory by plugging the respective 
values for $\beta$ given in Equation 2.3 into Equation 2.2. From Equation 2.3 we notice that 
classical theory predicts that $\beta$ is given by a linear relationship in $B$. We can write a 
general linear relationship between $y$ and $x$ in the slope-intercept form of 

y = mx + b (2.4) 

where m is the slope of the line and b is the y-intercept of the line. By definition the 
y-intercept is where the line crosses the y-axis, which occurs when x is zero. 

In general, physicists like working with linear relationships. They are easier to fit 
than complicated curves and oftentimes are able to provide just as much information. 
The only problem we have with Equation 2.3 is that we expect relativistic theory to be 
correct, not classical theory, and it is clear that $\beta$ is not given by a linear relationship in 
$B$ for relativistic theory. To circumvent this problem and allow us to still make a linear 
plot, we recast Equation 2.3 into a format where we have a linear relationship between 
some quantity and the variable $B$. Here we multiply $\beta$ by $\gamma$. 

$$\beta_c \gamma_c = \left(\frac{\rho B e}{m_e c^2}\right) \left(1 - \left(\frac{\rho B e}{m_e c^2}\right)^2\right)^{-1/2} \tag{2.5a}$$

$$\beta_r \gamma_r = \left(\frac{\rho e}{m_e c^2}\right) B \tag{2.5b}$$

Now we have a linear relationship for our relativistic theory, from which we can predict 
the charge to mass ratio of the electron from the slope of the line! 

With Equation 2.5 we can better understand the curves presented in Figure 2.1. Looking 
at our classical prediction from Equation 2.5a we no longer expect a linear relationship, 
and expect that for large $B$ values, the value for $\beta\gamma$ will explode. This behavior 
can be seen in the blue line of Figure 2.1. As $B$ grows large the value $\beta\gamma$ grows rapidly, 
and most certainly in a non-linear fashion. 

The green curve, corresponding to relativistic theory, is linear as we expect from 
Equation 2.5b. Notice that our data points match very well with relativistic theory; all 
but one of the data points agree with the green line to within their uncertainty! 
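
The behaviour of the two theory curves can be checked numerically. The sketch below 
works with the dimensionless combination $k = \rho B e / (m_e c^2)$, which is proportional to $B$; 
the variable name `k` and the sample values are assumptions made purely for illustration, 
since the value of $\rho$ is not given here. 

```python
import math


def beta_gamma(beta):
    """Product of beta and the Lorentz factor gamma = 1/sqrt(1 - beta^2)."""
    return beta / math.sqrt(1.0 - beta ** 2)


def beta_classical(k):
    """Classical prediction: beta is linear in k (and so in B)."""
    return k


def beta_relativistic(k):
    """Relativistic prediction, 1/sqrt(1 + (1/k)^2), written as k/sqrt(1 + k^2)."""
    return k / math.sqrt(1.0 + k ** 2)


for k in (0.5, 0.9, 0.99):
    classical = beta_gamma(beta_classical(k))
    relativistic = beta_gamma(beta_relativistic(k))
    print(f"k = {k:4.2f}  classical = {classical:7.3f}  relativistic = {relativistic:7.3f}")
```

The relativistic column simply reproduces $k$, which is the straight line of Equation 2.5b, 
while the classical column grows without bound as $k$ approaches 1. 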

The final red curve in Figure 2.1 gives the best linear fit of the data points. What we 
mean by "best" will be discussed in the following section. The general idea, however, 
is that a linear fit matches the data points well, and allows us to calculate the charge to 
mass ratio of the electron from the slope of our fit. For this fit we have obtained values 
of, 

$$m = 0.0129 \pm 0.0021, \qquad b = -0.08 \pm 0.21 \tag{2.6}$$

where $m$ is the slope and $b$ is the y-intercept as defined previously in Equation 2.4. If we plug in 
our values for $\rho$ and $c^2$ we can then calculate the charge to mass ratio of the electron. 

$$\frac{e}{m_e} = 0.0129 \left(\frac{c^2}{\rho}\right) \tag{2.7}$$
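
Because the charge to mass ratio is just the slope multiplied by a constant, its relative 
uncertainty is the same as that of the slope. As a quick worked estimate, assuming that $\rho$ 
and $c$ contribute negligible uncertainty (an assumption made here only for illustration), 

$$\frac{\sigma_{e/m_e}}{e/m_e} = \frac{\sigma_m}{m} = \frac{0.0021}{0.0129} \approx 0.16$$

so under that assumption this fit would determine the charge to mass ratio to roughly 16%. 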

The core idea to come away with from the discussion above is how easy it is to 
represent theory and experiment, and their subsequent agreement with just a single 
plot. In Figure 2.1 we have shown how drastically different the theoretical predictions 
of classical and relativistic mechanics are, and we have shown that the data matches the 
relativistic prediction, not the classical prediction. Furthermore, by fitting the data we 
have managed to calculate the ratio of two fundamental constants of nature, the mass 
of the electron and the charge of the electron. Remember that to obtain a linear best fit 
we had to recast the equations in such a way as to provide a linear relationship. This is 
a technique that will be used throughout this book. 



2.3 Fitting Methods 

In the previous section we discussed the curve of "best" fit for the data points in Figure 
2.1. But what exactly do we mean by "best", and how do we determine the fit? To fully 
explain this we must first introduce some new definitions. 

To begin, we define a residual.³ If we have some observed value $y_o$ but expect the 
value of $y_e$ then the residual is just $y_o - y_e$. We can apply this definition to the plot of 
Figure 2.1. We can think of the best fit line as the expected values $y_e$, and the actual 
data points as the observed values $y_o$. Using common sense we can define what a best 
fit line is; it is the line that minimizes the sum of the residuals between the line and the 
data points. In other words, we want every data point to fall as closely as possible to 
the best fit line. We can adjust the parameters $m$ and $b$ for the line accordingly until we 
have the smallest possible sum of residuals for the line. 

Before we apply this idea to the plot we must take into consideration two problems. 
The first problem is that the residuals can be both positive and negative, and so we could 
minimize the total residuals to zero, while still having very large positive and negative 
residuals that cancel each other out. We can negate this effect by adding the residuals 
together in quadrature, denoted by the $\oplus$ symbol. By this, we mean that instead of 
adding the residuals together, we add the squares of the residuals.⁴ 



³ Residuals are not just used for experimental data but are also very important in numerical analysis. 
In reality, fitting methods and their theory are more in the realm of numerical analysis than physics. 
Because of this it is important that physicists have a strong grasp of numerical analysis. 

⁴ There is a more detailed mathematical explanation as to why we add the residuals in quadrature, and 
this stems from the theory behind chi-squared distributions. A more intuitive way to think of 
the residuals is to think of them as components of a vector, and we are trying to find the magnitude 
of the vector. The same method is used for explaining why we add uncertainties in quadrature, as is 
described in Chapter 1 of this book. 

The second problem is that we don't want data points with very large uncertainty to 
affect the placement of the line just as much as data points with small uncertainty. For 
example, consider a case where we have made three measurements with associated 
uncertainty of 1.0 ± 0.1, 2.0 ± 0.1, and 20 ± 10. For the last data point the experimenter 
was distracted by a ninja and so the uncertainty is huge (50%). We might not 
want to throw out the last data point, but we certainly don't want our line of best fit 
to consider the last data point equally with the first two. To ensure the first two data 
points are considered more than the final data point, we must weight the residuals by 
the associated uncertainty for that specific measurement. We do this by dividing the 
residual by the uncertainty. Subsequently, large uncertainty makes the residual smaller 
(and it matters less), and small uncertainty makes the residual larger (it matters more). 
Summing these weighted residuals in quadrature yields a value called the chi-squared 
value, denoted by the Greek letter $\chi^2$. 

$$\chi^2 = \sum_{i=1}^{N} \left(\frac{y_i - f(x_i)}{\sigma_i}\right)^2 \tag{2.8}$$

Here we have $N$ data points $y_i$ measured at the variable $x_i$ with an associated uncertainty 
of $\sigma_i$. Additionally, the value $f(x_i)$ is the value calculated by the curve we are fitting to 
the data points at the variable $x_i$. 
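
The effect of the weighting can be seen with the three measurements from the example 
above. In this sketch the $x$ positions (1, 2, and 3) and the candidate line $y = x$ are 
assumptions chosen only to make the arithmetic easy to follow; the function name 
`chi_squared` is likewise illustrative. 

```python
def chi_squared(ys, sigmas, predictions):
    """Sum of the squared residuals, each weighted by its uncertainty."""
    return sum(((y - f) / s) ** 2 for y, s, f in zip(ys, sigmas, predictions))


# Measurements from the text: 1.0 +/- 0.1, 2.0 +/- 0.1, and 20 +/- 10.
ys = [1.0, 2.0, 20.0]
sigmas = [0.1, 0.1, 10.0]

# Assumed x positions and an assumed candidate line y = x.
xs = [1.0, 2.0, 3.0]
predictions = [x for x in xs]

weighted = chi_squared(ys, sigmas, predictions)
unweighted = sum((y - f) ** 2 for y, f in zip(ys, predictions))
print(f"weighted chi^2 = {weighted:.2f}, unweighted sum of squares = {unweighted:.0f}")
```

Without weighting, the distracted measurement dominates the sum; once each residual is 
divided by its uncertainty, the same point contributes far less. 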

We now have a method by which we can find the line of best fit for a given set of data 
points; we minimize $\chi^2$ as is given in Equation 2.8. But what if we wish to compare the 
goodness of our fit for one set of data with another set of data? By looking at Equation 
2.8 for $\chi^2$ we see that if the number of data points increases but the uncertainty on each 
data point remains constant, our $\chi^2$ will increase. This means that by our definition of 
best fit, more data points means a worse fit. This of course does not make sense, and so 
we need to modify Equation 2.8 slightly. 

Before we modify Equation 2.8, we need to note this equation does not just apply to 
fitting data points with lines, but with any arbitrary curve, as f(xi) can be determined 
by any arbitrary function! If we are going to modify Equation 2.8 so we can compare 
the goodness of different linear fits, we may as well modify it so that we can compare 
the goodness of any arbitrary curve. To do this we define the number of degrees of 
freedom (commonly abbreviated as NODF) for a fit as, 

$$\nu = N - n - 1 \tag{2.9}$$

where $N$ is the number of data points being fitted, $n$ is the number of parameters of 
the curve being fit, and the Greek letter $\nu$ (spelled nu) is the number of degrees of 
freedom. The number of parameters for a curve is the number of variables that need to 
be determined. For the line there are two variables, the slope, $m$, and the y-intercept, $b$. 
For a second degree polynomial such as, 

$$y = a_0 + a_1 x + a_2 x^2 \tag{2.10}$$

there are three parameters, $a_0$, $a_1$, and $a_2$. Now, by using the number of degrees of 
freedom from Equation 2.9 and $\chi^2$ from Equation 2.8 we can define the reduced 
chi-squared of a fit which allows for the comparison of the goodness of a fit using any 
number of data points and any arbitrary curve. 

$$\chi^2_\nu = \frac{\chi^2}{\nu} = \frac{1}{\nu} \sum_{i=1}^{N} \left(\frac{y_i - f(x_i)}{\sigma_i}\right)^2 \tag{2.11}$$

The number of degrees of freedom for a curve must always be greater than 0 to fit 
that curve to the data points using the minimization of $\chi^2_\nu$ method. We can think of this 
intuitively for the case of a line. A line is defined by a minimum of two points, so if we 
are trying to fit a line to a data set with two data points we see there is only one line we 
can draw, and Equation 2.9 gives a $\nu$ that is not greater than 0. This means we don't need 
to bother with minimizing $\chi^2_\nu$ because 
we already know the solution. Another way to think of the number of degrees of freedom 
for a fit is that for larger $\nu$ the fit is more free and for smaller $\nu$ the fit is more confined. If we 
add ten more data points to our data set in the example with the line, $\nu$ becomes 9 and 
the fit of the line is more free because it has more data points to consider when finding 
$m$ and $b$. If we change the type of curve we are fitting to the second degree polynomial 
of Equation 2.10, $\nu$ now becomes 8 because we have added an extra parameter. This 
decreases the freedom of the fit because the fit must now use the same number of data 
points to find more parameters; there are essentially fewer options for the parameters of 
the curve. 
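
The counting in the examples above can be reproduced with a short sketch of Equations 2.9 
and 2.11. The function names are hypothetical, and raising an error when $\nu$ is not positive 
is a design choice made for this illustration rather than something the text prescribes. 

```python
def degrees_of_freedom(n_points, n_params):
    """Number of degrees of freedom as defined in Equation 2.9."""
    return n_points - n_params - 1


def reduced_chi_squared(chi2, n_points, n_params):
    """Chi-squared divided by the number of degrees of freedom (Equation 2.11)."""
    nu = degrees_of_freedom(n_points, n_params)
    if nu <= 0:
        raise ValueError("not enough data points to fit this curve")
    return chi2 / nu


print(degrees_of_freedom(12, 2))  # line fitted to twelve data points
print(degrees_of_freedom(12, 3))  # second degree polynomial, same data
print(degrees_of_freedom(9, 2))   # line fitted to nine data points
```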

Looking at Equation 2.11 we can now make some general statements about what 
the $\chi^2_\nu$ for a curve of best fit indicates. We see that $\chi^2_\nu$ becomes very large, $\chi^2_\nu \gg 1$, 
when either the values for $\sigma_i$ are very small, or the differences between the observed and 
expected results are very large. In the first case the experimenter has underestimated 
the uncertainty but in the second case the fit just does not match the data and most 
probably the theory attempting to describe the data is incorrect. If the $\chi^2_\nu$ is very 
small, $\chi^2_\nu \ll 1$, either the uncertainties are very large, or the residuals are very small. 
If the uncertainties are too large this just means the experimenter overestimated the 
uncertainty for the experiment. However, if the residuals are too small this means that 
the data may be over-fit. This can happen when the number of parameters for the 
fitting curve is very near the number of data points, and so the best fit curve is able to 
pass very near most of the points without actually representing a trend in the data. 

The above discussion is a bit complicated but the bottom line is that a $\chi^2_\nu$ near one 
usually indicates a good fit of the data. If $\chi^2_\nu$ is too small either the fitting curve has 
too many parameters or the uncertainty was overestimated. If $\chi^2_\nu$ is too large the fitting 
curve is either incorrect, or the uncertainty was underestimated. In Figure 2.1 the $\chi^2_\nu$ 
value, displayed beneath the line of best fit, is given as 0.29. This value is less than 1 but 
still close (same order of magnitude) and so the fit is a good fit. Perhaps the uncertainty 
was slightly overestimated for the data points, but not by much. 

Now let us consider in more detail the fit of Figure 2.1. The curve of best fit has 
$\chi^2_\nu = 0.29$, which we can calculate explicitly by using the best fit parameters of Equation 
2.6, the $x$ and $y$ values for each data point, and their associated uncertainty. The $x_i$ and 
$y_i$ values can be read from Figure 2.1 and then $f(x_i) = m x_i + b$ calculated using $m$ and 
$b$ from Equation 2.6. However, this can be a bit tedious and so the numbers $x_i$, $y_i$, $\sigma_i$, 
and $f(x_i)$ are provided in Table 2.1.⁵ 

$x_i$ [Gauss]    $y_i$ [unitless]    $\sigma_i$ [unitless]    $f(x_i)$ [unitless] 

Table 2.1: Data points used in Figure 2.1 with associated uncertainty. Column one gives 
$x_i$, the magnetic field $B$ in Gauss. Column two gives $y_i$, the $\beta\gamma$ of the electron, 
and column three its associated uncertainty, $\sigma_i$. Column four gives the value 
calculated for the point $x_i$ using the best fit parameters of Equation 2.6 in 
Equation 2.4. 

We have 9 data points so $N = 9$. We are fitting with a line so $n = 2$ and consequently 
$\nu = 6$. Using the numbers from Table 2.1 we can calculate $\chi^2_\nu$ from Equation 2.11 for 
the line of best fit in Figure 2.1 in all of its gory detail. 

$$\begin{aligned}
\chi^2_\nu = \frac{1}{6}\Bigg[ &\left(\frac{0.981 - 0.952}{0.073}\right)^2 + \left(\frac{y_2 - f(x_2)}{\sigma_2}\right)^2 + \left(\frac{1.036 - 1.081}{0.069}\right)^2 \\
+ &\left(\frac{1.126 - 1.146}{0.066}\right)^2 + \left(\frac{1.198 - 1.211}{0.054}\right)^2 + \left(\frac{1.292 - 1.275}{0.034}\right)^2 \\
+ &\left(\frac{1.374 - 1.340}{0.089}\right)^2 + \left(\frac{1.435 - 1.404}{0.125}\right)^2 + \left(\frac{1.398 - 1.469}{0.096}\right)^2 \Bigg] \\
= \frac{1}{6}\Big[ &(0.397)^2 + (0.014)^2 + (-0.652)^2 + (-0.303)^2 + (-0.240)^2 \\
+ &(0.500)^2 + (0.382)^2 + (0.248)^2 + (-0.740)^2 \Big] \\
= \frac{1}{6}\Big[ &0.158 + 0.000 + 0.425 + 0.092 + 0.058 + 0.25 \\
+ &0.146 + 0.062 + 0.548 \Big] \\
= \frac{1.739}{6} &= 0.290
\end{aligned} \tag{2.12}$$
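
The same arithmetic is easy to check by machine. The sketch below starts from the nine 
weighted residuals quoted in Equation 2.12 and computes $\chi^2$ and $\chi^2_\nu$; the variable names 
are illustrative. Because the residuals are already rounded to three decimal places, the 
sum can differ from the hand calculation in the last digit. 

```python
# Weighted residuals (y_i - f(x_i)) / sigma_i as quoted in Equation 2.12.
residuals = [0.397, 0.014, -0.652, -0.303, -0.240, 0.500, 0.382, 0.248, -0.740]

n_params = 2                         # a line has two parameters, m and b
nu = len(residuals) - n_params - 1   # Equation 2.9

chi2 = sum(r ** 2 for r in residuals)
chi2_nu = chi2 / nu

print(f"chi^2 = {chi2:.3f}, nu = {nu}, reduced chi^2 = {chi2_nu:.2f}")
```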



⁵ Feel free to check them. It is not only a good exercise to just do the calculations, but also to check 
the author's work. 

We can also check to ensure that the values for $m$ and $b$ from Equation 2.6 give the 
minimum $\chi^2_\nu$ by changing the best fit values of $m$ and $b$ slightly. We do not explicitly 
write out the calculations, but the results in Table 2.2 can be checked using the exact 
same method as Equation 2.12 but now using different $f(x_i)$ values based on the change 
in $m$ and $b$. 

              b - 0.01      b         b + 0.01 
m - 0.01      401.6         393.8     386.0 
m             0.329         0.290     0.329 
m + 0.01      386.0         393.8     401.6 

Table 2.2: The reduced chi-squared for small perturbations around the best fit parameters 
given in Equation 2.6 for Figure 2.1. 



From Table 2.2 we can see that the parameters given in Equation 2.6 actually do 
minimize the $\chi^2_\nu$ for the data points of Figure 2.1. More importantly we see how much 
the values change over small variations in the parameters. Figure 2.2 shows the same 
behaviour of $\chi^2_\nu$ as Table 2.2 but now visually represents the change of $\chi^2_\nu$ with a three 
dimensional surface. The lowest point on the plot corresponds to the best fit parameters 
of Equation 2.6 and the lowest $\chi^2_\nu$ value of 0.29. 

If a marble were placed on the surface of Figure 2.2 it would roll to the lowest point. 
This is exactly how the minimum reduced chi-squared for a line of best fit is found.⁶ For 
a curve with more than two parameters (take for example the polynomial of Equation 
2.10) the marble is just placed on an n-dimensional surface. This is of course more 
difficult to visualize, but the principle is exactly the same as for the line. 

Looking back at Equation 2.6 we see that there are uncertainties associated with the 
best fit parameters. There is a rigorous derivation for these uncertainties, but it is also 
possible to visualize them using the marble analogy above. We now take the marble and 
place it at the minimum that we previously found by dropping the marble, and push the 
marble with an equal amount of force in both the $m$ and $b$ directions. The marble will 
begin oscillating back and forth around the minimum point, but the oscillations will not 
be the same in both the $m$ and $b$ directions. Figure 2.2 looks somewhat like a channel, 
and so we expect to see rather large oscillations along the direction of $b$, but very small 
ones along the direction of $m$. The size of these oscillations corresponds directly with the 
size of the uncertainties associated with $m$ and $b$. Looking back at the uncertainties in 
Equation 2.6 we see that indeed we do have very small uncertainty on $m$, and a much 
larger uncertainty on $b$! 
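
For the special case of a straight line the minimum of $\chi^2$ and the parameter uncertainties 
can be written in closed form, so no marble is needed. The sketch below uses the standard 
weighted least-squares formulas for a line; it is not necessarily how the fit in Figure 2.1 
was produced, and the data are synthetic points (an assumed grid of fields from 80 to 120 
Gauss lying exactly on a line, with an assumed uniform uncertainty of 0.07) chosen only 
to show the channel shape. 

```python
import math


def fit_line(xs, ys, sigmas):
    """Weighted least-squares fit of y = m*x + b.

    Returns (m, b, sigma_m, sigma_b) using the closed-form solution that
    minimizes chi-squared for a straight line.
    """
    w = [1.0 / s ** 2 for s in sigmas]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    delta = S * Sxx - Sx ** 2
    m = (S * Sxy - Sx * Sy) / delta
    b = (Sxx * Sy - Sx * Sxy) / delta
    return m, b, math.sqrt(S / delta), math.sqrt(Sxx / delta)


# Synthetic, noise-free data (assumed values, not the measurements of Table 2.1).
xs = [80.0 + 5.0 * i for i in range(9)]
ys = [0.0129 * x - 0.08 for x in xs]
sigmas = [0.07] * len(xs)

m, b, sigma_m, sigma_b = fit_line(xs, ys, sigmas)
print(f"m = {m:.4f} +/- {sigma_m:.4f}, b = {b:.2f} +/- {sigma_b:.2f}")
```

Because the $x$ values sit far from zero, a small change in slope swings the intercept a long 
way, so the uncertainty on $b$ comes out much larger than the uncertainty on $m$, just as 
the channel in Figure 2.2 suggests. 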



⁶ The method for minimizing $\chi^2_\nu$ varies from numerical package to numerical package, but most use 
what is known as the Levenberg-Marquardt algorithm, which is a combination of the Gauss-Newton 
algorithm and the method of steepest descent (gradient descent). 

Figure 2.2: The surface plot for the minimization of the reduced chi-squared of Figure 
2.1. Notice how the minimum occurs near $b = -0.08$ and $m = 0.0125$, 
corresponding to the best fit parameters given in Equation 2.6. 
3 Newton's Laws 



In 1687 Isaac Newton published his Philosophiæ Naturalis Principia Mathematica and 
revolutionized the field of physics (some might even say created it). Within the Principia 
Newton postulated his famous three laws of motion.¹ 

1. Every body perseveres in its state of rest, or of uniform motion in a right line, 
unless it is compelled to change that state by forces impressed thereon. 

2. The alteration of motion is ever proportional to the motive force impressed; and is 
made in the direction of the right line in which that force is impressed. 

3. To every action there is always opposed an equal reaction: or the mutual actions 
of two bodies upon each other are always equal, and directed to contrary parts. 

Perhaps a more succinct and modern version will help convey the simplicity of the laws. 

1. An object at rest will remain at rest and an object in motion will remain in motion 
unless acted upon by an external force. 

2. Force is equal to mass times acceleration. 

3. For every action there is an equal and opposite reaction. 

The laws themselves are deceptively simple. It seems obvious that an object at rest 
will remain at rest, yet before Newton, no one had ever considered this as a physical 
law! Because people saw this behavior every day they took it for granted, and did not 
even realize it was a general rule. Oftentimes the quote "the exception proves the rule" 
is used. In the case of Newton's three laws there were no exceptions, and so no one 
realized the rule. 

These three laws make up what is known as Newtonian mechanics and remained 
undisputed for over 200 years until the advent of relativity and quantum mechanics. 
Relativity describes objects going very close to the speed of light, much faster than 
anything seen in day-to-day life. Quantum mechanics describes objects that are much 
smaller than anything in day-to-day life, on the same order of magnitude as the size of the atom, 
or smaller. The combination of these two fields of physics is relativistic quantum 
mechanics. The beauty of both relativity and quantum mechanics is that the limits of 
these theories (very slow for relativity, or very large for quantum mechanics) approach 
Newtonian mechanics. 



¹ Newton's Principia: the mathematical principles of natural philosophy, translated by Daniel Adee from 
the original Latin. 1846. The full text can be downloaded here (it is a very large pdf). 




One of the primary tools in Newtonian mechanics is the use of free-body diagrams. 
In these diagrams all the relevant forces for a system (usually involving some ridiculous 
combination of sliding blocks and pulleys) are represented as force vectors, and the 
motion of the system can then be calculated. These are also sometimes called force 
diagrams and help us visually apply the three laws of motion. Consider for example a 
block being pushed across a frictionless plane. By the first law we know the block cannot 
remain at rest or in uniform motion because it is experiencing an external force. By the 
second law we know the 
force being exerted on the block is equal to its mass times its acceleration. And by 
the third law we know the block is not falling, as the frictionless plane is providing an 
equal and opposite force countering the force of gravity on the block. 

Sometimes free-body diagrams can become very complicated, and it is simpler to use 
what is called Lagrangian mechanics. Here, a much more mathematical approach is 
used to derive the equations of motion for the system, rather than visualizing the 
physical reality. In short, free-body diagrams provide a much more intuitive 
approach to physical problems, while Lagrangian mechanics sometimes provides a simpler 
method. 



3.1 Experiment 

Perhaps the second law of motion is the least intuitive of all the laws. We know the 
first law is true because objects sitting on desks or lab benches don't just get up and 
walk away unless some sort of force is applied. The third law is clearly true or else when 
sitting down, we would fall right through our chair. The second law however is not quite 
so obvious, and so in the experiment associated with this chapter, the second law is verified 
using two methods. In the first method we vary the mass of the object, while in the 
second method we vary the force acting on the object. The setup for the first method 
consists of a cart moving along an almost frictionless track. Attached to this cart is a 
string which runs over a pulley and is attached to a small weight. Gravity pulls the 
weight downwards which subsequently causes a tension in the string which acts on the 
cart. 

The first step in approaching this problem is to draw a free-body diagram, as is done 
in Figure 3.1(a). From this diagram we can see that the tension on the string, $T$, must 
equal the force on the hanging weight $m_w$, which is just $m_w g$ (neglecting the small 
acceleration of the hanging weight itself, which is a good approximation when $m_w$ is 
much smaller than $m_c$). We also can see that 
the only force acting on the cart is the tension, so by Newton's second law we have 
$m_c a = m_w g$. Here $m_c$ is the mass of the cart. From this we know the acceleration of the 
cart. 

$$a = \frac{m_w g}{m_c} \tag{3.1}$$

We see that if we increase the mass of the cart, the acceleration will decrease. If we verify 
Equation 3.1, then we have verified Newton's second law. We can do this by changing 
the mass of the cart $m_c$ while applying the same force, and measuring the acceleration 
of the cart. 

(a) Pulley Method    (b) Ramp Method 

Figure 3.1: Free-body diagrams for the two methods used to verify Newton's second law 
in the experiment associated with this chapter. 

With the second method, we roll the cart down an inclined plane as shown in the 
free-body diagram of Figure 3.1(b). Now, the force acting on the cart is just, 

$$F = m_c g \sin\theta \tag{3.2}$$

and so by varying $\theta$ we can change the force on the cart. From this we can 
theoretically calculate the acceleration using Newton's second law and compare this to our 
experimentally determined values. 

$$a = g \sin\theta \tag{3.3}$$

Looking at this equation we see that if we plot $\sin\theta$ on the x-axis and our measured $a$ 
on the y-axis we should obtain a straight line. Furthermore, the slope of this straight 
line should be $g$. 
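
The straight-line analysis suggested by Equation 3.3 can be sketched in a few lines of 
Python. This is only an illustration: the ramp angles are made up, the "measured" 
accelerations are generated from Equation 3.3 itself rather than taken from the lab, and 
the helper name `fit_line` is our own. 

```python
import math

G = 9.81  # m/s^2, assumed value of g used to generate the illustrative data

# Hypothetical ramp angles in degrees; these are NOT measurements from the lab.
angles_deg = [2.0, 4.0, 6.0, 8.0, 10.0]
sin_theta = [math.sin(math.radians(angle)) for angle in angles_deg]

# Synthetic accelerations built directly from Equation 3.3, a = g sin(theta).
accel = [G * s for s in sin_theta]


def fit_line(xs, ys):
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sin_theta, accel)
print(f"slope = {slope:.3f} m/s^2, intercept = {intercept:.3f} m/s^2")
```

With real data the slope will not reproduce $g$ exactly, and a non-zero intercept is a 
useful hint that friction or a slightly tilted track is affecting the measurement. 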
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4 Momentum 



Oftentimes the question "what's the point, what can I actually do with physics?"¹ is 
asked by frustrated physics students (or non-physics students forced to study physics). 
The answer is almost anything, although this is not always readily apparent. However, 
one area where the use of physics is clear is in the modeling of kinematics, or the 
interactions between objects. But how does modeling kinematics help us? Everything 
from video game physics engines to crime scene investigations uses kinematics and the 
fundamental laws of physics to recreate realistic physical situations. 

Another extremely useful² application of kinematics is the analysis of movie scenes to 
determine if they are physically possible. Consider the opening scene to the 2009 "Star 
Trek" movie where a young Captain Kirk has stolen his step-father's vintage Corvette 
and is driving through a corn field in Iowa. A police officer tries to pull Kirk over, but 
he refuses and accidentally drives the Corvette off a cliff during the ensuing pursuit. 
Luckily he manages to jump out of the Corvette just in time and grabs the edge of the 
cliff.³ Is this scene physically possible? 

First we need to make a few assumptions and rough estimates. We can estimate the 
mass of the car to be approximately 1400 kg, $m_c = 1400$, and the mass of Kirk to be 
around 55 kg, $m_k = 55$. From an earlier shot in the scene we know that the car (and 
Kirk) has been traveling near 75 mph or 33.5 m/s, $v_c^i = 33.5$. After the jump we 
know that Kirk has a velocity of zero as he hangs onto the cliff face, $v_k^f = 0$. 

By using conservation of momentum, which requires the momentum in a system 
to be conserved (the same before as it is after), we can find the velocity of the car after 
Kirk jumps out. 

$$p^i = 33.5 \times 1400 + 33.5 \times 55 = v_c^f \times 1400 + 0 \times 55 \Rightarrow v_c^f = 34.8~\text{m/s} = 125~\text{km/h} \tag{4.1}$$

The action of Kirk jumping out of the car increases the speed of the car by about 5 km/h. 
But we can go even further. By Newton's second law we know that force is equal to mass 
times acceleration, 

$$F = ma = \frac{\Delta p}{\Delta t} \tag{4.2}$$

which we can also write in terms of change of momentum. We already know the change 
of momentum for Kirk, so to find the force all we need is the time period over which 
the momentum changed. Let us assume he managed to perform his entire jump within 
a second, so $\Delta t = 1$ s. 

$$F_k = \frac{(33.5 \times 55 - 0)~\text{kg m/s}}{1~\text{s}} = 1842.5~\text{N} \tag{4.3}$$

The average adult can lift around 135 kg with their legs over a short period, equivalent 
to a force of about 1350 N. Assuming that Kirk has the strength of an adult, we see that 
Kirk would fall about 500 N short and plunge to his death over the cliff.⁴ 

¹ Oftentimes with slightly stronger language. 
² Useful is highly dependent upon the eye of the beholder. 
³ Sorry about the spoiler, I know everyone was hoping Kirk would die. 
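
The arithmetic of Equations 4.1 to 4.3 is easy to check with a short script. The sketch 
below uses only the rough estimates quoted in the text; the variable names are our own 
choice and nothing here is measured data. 

```python
# Rough estimates quoted in the text (not measurements).
m_car = 1400.0       # kg, mass of the car
m_kirk = 55.0        # kg, mass of Kirk
v_initial = 33.5     # m/s, shared speed of the car and Kirk before the jump
v_kirk_final = 0.0   # m/s, Kirk hanging from the cliff edge
jump_time = 1.0      # s, assumed duration of the jump
leg_force = 1350.0   # N, leg force estimate for an adult used in the text

# Equation 4.1: total momentum before the jump equals total momentum after.
p_initial = (m_car + m_kirk) * v_initial
v_car_final = (p_initial - m_kirk * v_kirk_final) / m_car

# Equations 4.2 and 4.3: force is the change in momentum over the time taken.
force_needed = m_kirk * (v_initial - v_kirk_final) / jump_time
shortfall = force_needed - leg_force

print(f"car speed after the jump: {v_car_final:.1f} m/s ({v_car_final * 3.6:.0f} km/h)")
print(f"force Kirk must supply:   {force_needed:.1f} N")
print(f"shortfall:                {shortfall:.1f} N")
```

Changing `jump_time` shows how sensitive the conclusion is to the one quantity we 
simply guessed: a jump spread over 1.5 s would already bring the force below 1350 N. 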

4.1 Collisions 



In kinematics there are two types of collisions, inelastic and elastic. In an inelastic 
collision kinetic energy is not conserved, while in an elastic collision kinetic energy is 
conserved. Furthermore, there are two types of inelastic collisions, total and partial. 
At the end of a totally inelastic collision all of the individual objects are moving with the 
same velocity. In a partially inelastic collision, this does not occur. Explosions, similar 
to the example given above, correspond to the reverse of a totally inelastic collision. 
We begin with a single object moving at a single velocity, and end with a multitude of 
objects at different velocities (in the example above, Kirk and the car). Total energy 
is still conserved in inelastic collisions; the lost kinetic energy is converted into other 
forms, usually through the deformation of the colliding objects. 



Figure 4.1: An example collision of two objects, A and B, with conservation of momentum. 
Panel (a) shows the initial state, with $p^i = m_A v_A^i + m_B v_B^i$, and panel (b) shows 
the final state, with $p^f = m_A v_A^f + m_B v_B^f$. 



As an example let us consider a simple setup, shown in Figure 4.1. Object B of mass 
$m_B$ is moving with an initial velocity $v_B^i$ on a frictionless track when object A of mass 
$m_A$ is fired with an initial velocity of $v_A^i$ at object B. After the collision object A has 
final velocity $v_A^f$ and object B has final velocity $v_B^f$. Note that in Figure 4.1 the 
velocities are not necessarily in the direction indicated, but are drawn as an example. 



⁴ Yes, this example was written shortly after having just watched "Star Trek". In all honesty this scene 
falls much closer to reality than quite a few other scenes from the movie and in terms of popular 
films is about as close to a realistic physical situation as you will see. 

If the collision between object A and object B is elastic, kinetic energy is conserved. 

$$\frac{1}{2} m_A (v_A^i)^2 + \frac{1}{2} m_B (v_B^i)^2 = \frac{1}{2} m_A (v_A^f)^2 + \frac{1}{2} m_B (v_B^f)^2 \tag{4.4}$$

Here we have one equation and six unknowns, $m_A$, $m_B$, $v_A^i$, $v_B^i$, $v_A^f$, and $v_B^f$. 
Normally, however, we know the conditions before the collision, and want to determine the 
result. In this scenario we would then know $m_A$, $m_B$, $v_A^i$, and $v_B^i$, but still have 
two unknowns, $v_A^f$ and $v_B^f$. With two unknowns we need two equations to uniquely 
determine the solution, and so we use conservation of momentum to impose our second 
relation. 

$$m_A v_A^i + m_B v_B^i = m_A v_A^f + m_B v_B^f \tag{4.5}$$

We can simultaneously solve the system of equations above using substitution to find 
solutions for $v_A^f$ and $v_B^f$. 

$$
\begin{aligned}
v_A^f &= \frac{m_A v_A^i + m_B (2 v_B^i - v_A^i)}{m_B + m_A}, & v_A^f &= v_A^i \\
v_B^f &= \frac{m_B v_B^i + m_A (2 v_A^i - v_B^i)}{m_B + m_A}, & v_B^f &= v_B^i
\end{aligned}
\tag{4.6}
$$



Because of the quadratic terms in Equation 4.4 we have two solutions for both $v_A^f$ 
and $v_B^f$. Physically the first solution corresponds to when the two objects collide with 
each other after some time period. The second set of trivial solutions, where the initial 
velocities of the objects match their final velocities, corresponds to when the objects do 
not collide. This can occur when object A is fired at B with a velocity slower than B, 
when A is fired away from B and B is moving at a velocity slower than A, or when A is 
fired in the opposite direction of B. 

Let us now consider a real world example of an elastic collision. Elastic collisions are 
somewhat rare, but in pool, the interactions between the pool balls are almost completely 
elastic. Of course the balls roll, which adds another level of complexity (now angular 
momentum and rotational kinetic energy must be taken into account as well) but for now 
let us assume that the balls slide and do not experience friction. Consider hitting the cue 
ball (object A) at the eight ball (object B) in the final shot of a game. We know that the 
masses of the two balls are around 0.10 kg and that the eight ball begins at rest. Let us 
also assume that we know the speed of the cue ball to be 10 m/s.⁵ 

Plugging all these quantities into Equation 4.6, we can determine the final velocities 
of both the cue ball and the eight ball. 

$$
\begin{aligned}
v_A^f &= \frac{0.1 \times 10 + 0.1 (2 \times 0 - 10)}{0.1 + 0.1} = \frac{1 - 1}{0.2} = 0~\text{m/s} \\
v_B^f &= \frac{0.1 \times 0 + 0.1 (2 \times 10 - 0)}{0.1 + 0.1} = \frac{2}{0.2} = 10~\text{m/s}
\end{aligned}
\tag{4.7}
$$

From the results above, we see that after the collision the cue ball is completely at rest 
while the eight ball is now moving with the initial velocity of the cue ball, and has 
hopefully gone into the pocket. 

⁵ These numbers, while approximate, are relatively close to actual values that would be measured in a 
pool game. 
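
The non-trivial solution of Equation 4.6 translates directly into a small function. The 
sketch below reproduces the pool example and then, using hypothetical unequal masses 
chosen only for illustration, checks that both momentum and kinetic energy come out 
the same before and after the collision. The function names are our own. 

```python
def elastic_collision(m_a, m_b, v_a_i, v_b_i):
    """Final velocities (v_a_f, v_b_f) from the non-trivial solution of Equation 4.6."""
    total_mass = m_a + m_b
    v_a_f = (m_a * v_a_i + m_b * (2 * v_b_i - v_a_i)) / total_mass
    v_b_f = (m_b * v_b_i + m_a * (2 * v_a_i - v_b_i)) / total_mass
    return v_a_f, v_b_f


def momentum(m_a, m_b, v_a, v_b):
    """Total momentum of the two objects (Equation 4.5)."""
    return m_a * v_a + m_b * v_b


def kinetic_energy(m_a, m_b, v_a, v_b):
    """Total kinetic energy of the two objects (Equation 4.4)."""
    return 0.5 * m_a * v_a**2 + 0.5 * m_b * v_b**2


# Pool example from the text: cue ball (A) at 10 m/s hits the resting eight ball (B).
v_cue, v_eight = elastic_collision(0.1, 0.1, 10.0, 0.0)
print(f"cue ball: {v_cue:.2f} m/s, eight ball: {v_eight:.2f} m/s")

# Hypothetical unequal masses, used only to check both conservation laws.
m_a, m_b, v_a_i, v_b_i = 0.5, 1.0, 0.5, 0.0
v_a_f, v_b_f = elastic_collision(m_a, m_b, v_a_i, v_b_i)
p_before = momentum(m_a, m_b, v_a_i, v_b_i)
p_after = momentum(m_a, m_b, v_a_f, v_b_f)
ke_before = kinetic_energy(m_a, m_b, v_a_i, v_b_i)
ke_after = kinetic_energy(m_a, m_b, v_a_f, v_b_f)
print(f"momentum: {p_before:.4f} -> {p_after:.4f} kg m/s")
print(f"kinetic energy: {ke_before:.4f} -> {ke_after:.4f} J")
```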

Let us return to the setup of Figure 4.1 and assume a completely inelastic collision 
now. From this we know that objects A and B are moving at the same final velocity. 

$$v_A^f = v_B^f \tag{4.8}$$

Again we impose conservation of momentum from Equation 4.5, and solve for the final 
velocities in terms of the initial velocities and the mass of the objects. 

$$v_A^f = v_B^f = \frac{m_A v_A^i + m_B v_B^i}{m_A + m_B} \tag{4.9}$$

We can use the pool example above to again explore these results, but replace the eight 
ball with a lump of clay that sticks to the cue ball after the collision. 

$$v_A^f = v_B^f = \frac{0.1 \times 10 + 0.1 \times 0}{0.1 + 0.1} = 5~\text{m/s} \tag{4.10}$$

Both the clay and the cue ball are traveling at half the initial velocity of the cue ball. 
Unlike the elastic collision, the cue ball has a positive non-zero velocity, and so the shot 
would be a scratch. If collisions in pool were inelastic and not elastic, the game would 
be nearly impossible; every straight-on shot would be a scratch! 
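
The clay example can also be checked numerically, and doing so shows how much kinetic 
energy disappears in a fully inelastic collision. This is a minimal sketch using the 
numbers from the text; the function names are our own. 

```python
def inelastic_collision(m_a, m_b, v_a_i, v_b_i):
    """Shared final velocity after a fully inelastic collision (Equation 4.9)."""
    return (m_a * v_a_i + m_b * v_b_i) / (m_a + m_b)


def kinetic_energy(mass, velocity):
    """Kinetic energy of a single object."""
    return 0.5 * mass * velocity**2


# Cue ball (A) at 10 m/s sticks to a resting lump of clay (B) of the same mass.
m_a, m_b = 0.1, 0.1
v_a_i, v_b_i = 10.0, 0.0

v_final = inelastic_collision(m_a, m_b, v_a_i, v_b_i)
ke_before = kinetic_energy(m_a, v_a_i) + kinetic_energy(m_b, v_b_i)
ke_after = kinetic_energy(m_a + m_b, v_final)
fraction_lost = (ke_before - ke_after) / ke_before

print(f"shared final velocity: {v_final:.2f} m/s")
print(f"kinetic energy: {ke_before:.2f} J before, {ke_after:.2f} J after")
print(f"fraction of kinetic energy lost: {fraction_lost:.2f}")
```

Momentum is the same before and after, but for two equal masses half of the kinetic 
energy is gone, which is exactly what separates this case from the elastic one. 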

The methods and equations outlined above are valid for any type of two-body collision. 
While the examples were given in one dimension (along the x-axis only), they 
can be applied to three dimensional problems as well. Now the velocities are given by 
vectors, but these can be split into their components, $v_x$, $v_y$, and $v_z$. Conservation 
of momentum (Equation 4.5) and Equation 4.9 can be applied to each component 
individually, while for smooth objects Equation 4.6 applies to the velocity components 
along the line joining the two objects at the moment of impact. 

4.2 Experiment 

In deriving Equations 4.6 and 4.9 we made the assumption that momentum is conserved 
without any basis to do so. The goal of the experiment associated with this chapter is to 
verify the theory outlined above by experimentally confirming conservation of momentum. 

The apparatus used in this lab consists of two carts of equal mass that move on a nearly 
frictionless track. The idea is that these carts can be collided with velocities and masses 
measured before and after the collision so that the two momenta can be compared. 
The experiment is split into three investigations. In the first investigation cart A is fired 
at cart B and the initial and final velocities of the carts are measured. In the second 
investigation, cart A is still fired at B, but now the mass of B is changed. In the final 
investigation cart A and B are placed back to back, and fired apart in an "explosion". 

One important point to remember for this experiment is that the collisions between 
the carts are neither elastic, nor fully inelastic. This is because the carts have both 
Velcro and magnets on them which dampen the collisions. Consequently, the behaviors 
of the carts will not resemble the pool ball examples above, but be a combination of the 
two. 

Figure 4.2: Theoretical final velocities for carts A and B in elastic (red and blue curves) 
and fully inelastic (green curve) collisions. The initial conditions for the 
collision are $v_A^i = 0.5$ m/s, $v_B^i = 0$ m/s, and $m_A = 0.5$ kg which are similar 
to values obtained in the experiment. 

In the second investigation (where the mass of B is changed), the lab manual requests 
plots of $v_A^f$ against $m_B$ and $v_B^f$ against $m_B$. In Figure 4.2 these plots 
have been made for the elastic and fully inelastic cases using Equations 4.6 and 4.9. 
Here the solid green line gives the velocity of both cart A and B after a fully inelastic 
collision. This curve is proportional to $1/(m_A + m_B)$ as expected from Equation 4.9 
with $v_B^i = 0$. For an elastic collision the blue dashed line gives $v_A^f$ and the red 
dotted line gives $v_B^f$. Notice that for any $m_B > m_A$, cart A bounces backwards off 
cart B. 
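
The theoretical curves described above can be tabulated with a short script. The sketch 
below uses the initial conditions quoted in the caption of Figure 4.2; the particular 
values of $m_B$ are hypothetical and the function names are our own. 

```python
M_A = 0.5    # kg, mass of cart A (value quoted for Figure 4.2)
V_A_I = 0.5  # m/s, initial velocity of cart A
V_B_I = 0.0  # m/s, cart B starts at rest


def elastic_final(m_b):
    """Final velocities (v_a_f, v_b_f) for an elastic collision, Equation 4.6."""
    total = M_A + m_b
    v_a_f = (M_A * V_A_I + m_b * (2 * V_B_I - V_A_I)) / total
    v_b_f = (m_b * V_B_I + M_A * (2 * V_A_I - V_B_I)) / total
    return v_a_f, v_b_f


def inelastic_final(m_b):
    """Shared final velocity for a fully inelastic collision, Equation 4.9."""
    return (M_A * V_A_I + m_b * V_B_I) / (M_A + m_b)


masses_b = [0.25, 0.5, 1.0, 2.0]  # kg, hypothetical values of the mass of cart B

print("m_B (kg)  elastic v_A  elastic v_B  inelastic v")
for m_b in masses_b:
    v_a_f, v_b_f = elastic_final(m_b)
    print(f"{m_b:8.2f}  {v_a_f:11.3f}  {v_b_f:11.3f}  {inelastic_final(m_b):11.3f}")
```

In every row the inelastic velocity lies between the two elastic velocities, and the 
elastic $v_A^f$ changes sign once $m_B$ exceeds $m_A$. 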

Because the collisions between the carts in this lab are partially inelastic, the plots 
from experiment will not match Figure 4.2 but fall between the two curves (assuming the 
same $v_A^i$, which will not be the case, but should be close). The $v_A^f$ data points should 
fall below the green curve but above the blue curve, while the $v_B^f$ data points should fall 
above the green curve but below the red curve. It is important to understand that these 
plots when made in the lab should not be fitted with any function because we do not 
have a theoretical form for the fitting curve. 

5 Rotation 



Linear motion describes many of the interactions around us in the physical world. 
Objects under the influence of gravity fall linearly, cars normally accelerate in a straight 
line, and the collisions of pool balls are most easily visualized and solved within a linear 
system. But sometimes, linear motion simply is not sufficient. A perfect example of 
this is the compound pendulum, discussed in Chapter 6. Here, 
attempting to find the period of a compound pendulum using linear motion is 
exceptionally complicated. Instead, we introduce a new framework, rotational motion, to 
help solve the problem. 

Rotational motion is everywhere. Every time a door is opened, principles of rotational 
motion are exhibited. Cars accelerating around turns, satellites orbiting the earth, 
wound clocks, weather systems, all of these phenomena are more intuitively described in 
a rotating system, rather than a linear system. More examples of systems that are best 
solved in a rotating frame are readily available in almost any introductory physics 
textbook. Further details and development of rotating systems can be found in An 
Introduction to Mechanics by Kleppner and Kolenkow as well as the MIT OpenCourseWare 
materials for 8.012 as taught by Adam Burgasser. 



5.1 Coordinate System 

But what exactly do we mean by a rotating system, or rotational mechanics? In linear 
motion we describe the interactions of objects through forces using the Cartesian coor- 
dinate system. Using this coordinate system is convenient because forces, accelerations, 
and velocities are oftentimes in straight lines. In rotational mechanics we don't change 
any of the fundamental laws of physics, we just change our coordinate system. We then 
transform the relations we have for linear motion to this new rotational frame. Perhaps 
the easiest way to understand this is to make a direct comparison between the quantities 
used to describe a system in linear and rotational frames. 

To begin, we must introduce the two coordinate systems. In linear motion Cartesian 
coordinates are used; the location of an object or the components of a vector are given 
by $x$ and $y$ as shown in Figure 5.1(a). In the simplest one-dimensional case (such as 
the examples in Chapter 4), linear position is given by the variable $x$. In rotational 
motion polar coordinates are used (or in three dimensions, spherical coordinates) 
as shown in Figure 5.1(b). Here the position of an object or the direction of a vector 
is described by the variables $\theta$ (the Greek letter theta) and $r$, where the units for 
$\theta$ are always radians, and $r$ is a distance. In the one-dimensional case for rotational 
motion, angular position is given by the variable $\theta$. Consequently, we see that $x$ 
transforms into $\theta$ and $y$ into $r$ for rotational motion.¹ 

Figure 5.1: The Cartesian and polar coordinate systems used in linear and rotational 
motion. Panel (a) shows Cartesian coordinates and panel (b) shows polar coordinates. 

In linear motion, the velocity or speed of an object is described by, 

$$v = \frac{\Delta x}{\Delta t} \tag{5.1}$$

where $\Delta x$ is change in position and $\Delta t$ is change in time. In rotational motion, 
time remains the same, as changing the coordinate system does not affect the passage of 
time. If we substitute $\Delta\theta$ for $\Delta x$ we arrive at the formula for angular velocity, 
which is denoted by the Greek letter $\omega$ (spelled omega). 

$$\omega = \frac{\Delta\theta}{\Delta t} \tag{5.2}$$

Angular velocity is given in units of radians per unit time.² Alternatively, angular 
velocity can also be thought of in terms of the number of rotations per second (one full 
rotation is $2\pi$ radians), and is sometimes also called angular frequency.³ For the 
purposes of this chapter, we will always refer to $\omega$ as angular velocity. 

Now that we have position and speed for an object in rotational motion, the only 
remaining quantity we need to describe the motion of an object is acceleration. In linear 
mechanics, 

$$a = \frac{\Delta v}{\Delta t} \tag{5.3}$$

where $\Delta v$ is change in velocity. Again, we just replace linear quantities with rotational 
quantities. 

$$\alpha = \frac{\Delta\omega}{\Delta t} \tag{5.4}$$

Here, angular acceleration is represented by the Greek letter $\alpha$ (spelled alpha).⁴ The 
units of $\alpha$ are just radians per second per second. Now we have a full arsenal of quantities 
to describe motion in a rotational system. 

¹ Of course it is possible to also look at the three dimensional case, in which case spherical coordinates 
are used. However, as the variables used in spherical coordinates are not consistent across disciplines 
and can cause confusion, we stick to the two dimensional case for this discussion. 
² We drop the vector sign for angular velocity as we always know it is either in the plus or minus $\theta$ 
direction. 
³ Angular frequency is the magnitude of angular velocity. 

5.2 Moment of Inertia 

Before we are able to describe physical laws with rotational motion, we need to introduce 
a rotational quantity analogous to mass. In a linear system mass is just given by the 
quantity $m$. In a rotational system, the equivalent of mass is given by moment of 
inertia, oftentimes denoted by the letter $I$. The moment of inertia for an object is 
defined as, 

$$I = \int r^2 \, dm \tag{5.5}$$

where $dm$ is an infinitesimal mass, and $r$ the distance of that infinitesimal mass from 
the axis around which the object is rotating. Let us consider an example to illustrate 
this. A mass $m$ is rotating around an axis at a distance $r$. If we assume the mass is 
a point mass, the moment of inertia for this system is just $mr^2$. 

Now let us consider a slightly more difficult example. In Figure 5.2(a) a rod with 
length $\ell$ is rotating about its center of mass horizontally. We can then write, 

$$I = \int_{-\ell/2}^{\ell/2} r^2 \, dm = \int_{-\ell/2}^{\ell/2} r^2 \mu \, dr \tag{5.6}$$

where we assume the rod has a uniform linear density of $\mu$ so $dm = \mu \, dr$. Taking this 
integral we obtain, 

$$I = \left[ \frac{\mu r^3}{3} \right]_{-\ell/2}^{\ell/2} = \frac{2\mu}{3} \left( \frac{\ell^3}{8} \right) = \frac{\mu \ell^3}{12} = \frac{m \ell^2}{12} \tag{5.7}$$

where in the final step we realize that the mass of the rod, $m$, is equal to $\mu\ell$. 

There are two very important properties to remember when dealing with moments of 
inertia. The first is that moments of inertia can be added if the objects are rotating 
about the same axis. The second property is the parallel axis theorem. This theorem 
states that if an object has a moment of inertia $I_{cm}$ around its center of mass, then 
the moment of inertia for that object when rotating around an axis a distance $r$ from the 
center of mass is just its center of mass moment of inertia plus the object's mass 
times $r$ squared. 

$$I = I_{cm} + mr^2 \tag{5.8}$$

Figure 5.2(b) demonstrates the use of the parallel axis theorem for the same rod of 
Figure 5.2(a). Now the rod is rotating about one of its ends, so $r = \ell/2$. This gives us, 

$$I = \frac{m \ell^2}{12} + \frac{m \ell^2}{4} = \frac{m \ell^2}{3} \tag{5.9}$$

where we have used Equations 5.7 and 5.8. Notice that in this case the moment of inertia 
increases. This is because we have moved more of the mass of the rod away from the 
axis. It is very important to realize that the moment of inertia for an object can change 
without the mass of the object being changed, merely by changing the distance of the mass 
from the axis of rotation. 

Figure 5.2: The moments of inertia for a rod rotating about its center of mass and 
rotating about its end. Panel (a) shows rotation about the center of mass, with the rod 
running from $r = -\ell/2$ to $r = \ell/2$ and $I = m\ell^2/12$, and panel (b) shows rotation 
about the end. 

⁴ Again, we drop the associated vector sign as we can express direction with positive and negative values. 
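
Equations 5.7 and 5.9 can be checked by carrying out the integral of Equation 5.5 
numerically. The sketch below chops a uniform rod into many small pieces and sums 
$r^2 \, dm$ for each piece. The rod mass and length are hypothetical values, and the 
function name `rod_moment_of_inertia` is our own. 

```python
def rod_moment_of_inertia(mass, length, axis_offset=0.0, pieces=10_000):
    """Midpoint-rule estimate of I = integral of r^2 dm for a uniform rod.

    axis_offset is the distance of the rotation axis from the center of mass.
    """
    mu = mass / length          # uniform linear density, so dm = mu * dr
    dr = length / pieces
    total = 0.0
    for i in range(pieces):
        position = -length / 2 + (i + 0.5) * dr  # piece position from the center
        r = position - axis_offset               # piece distance from the axis
        total += r * r * mu * dr
    return total


mass = 0.2     # kg, hypothetical rod mass
length = 0.5   # m, hypothetical rod length

i_center = rod_moment_of_inertia(mass, length)
i_end = rod_moment_of_inertia(mass, length, axis_offset=length / 2)

print(f"about the center: {i_center:.6f} vs m l^2 / 12 = {mass * length**2 / 12:.6f}")
print(f"about the end:    {i_end:.6f} vs m l^2 / 3  = {mass * length**2 / 3:.6f}")
print(f"parallel axis:    {i_center + mass * (length / 2)**2:.6f}")
```

The same function works for any axis position along the rod, which makes it a quick way 
to convince oneself that the parallel axis theorem is not special to the end of the rod. 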

5.3 Momentum, Energy, and Force 

In linear motion momentum is always conserved (as hopefully shown in the momentum 
lab for this course) and so the same must apply to rotational motion. In linear motion, 
momentum is just mass times velocity. 

$$p = mv \tag{5.10}$$

In rotational motion, we just replace mass with moment of inertia and velocity with 
angular velocity. 

$$L = I\omega \tag{5.11}$$

The resulting quantity, $L$, is called angular momentum. The angular momentum for 
a system, just as the momentum for a system in linear mechanics, is always conserved. 
This is why a figure skater can increase the speed of their spin by pulling in their 
arms. This lowers the moment of inertia of the skater, and so to conserve angular 
momentum, the angular velocity of the spin must increase. 
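
The figure skater argument amounts to one line of algebra, $I_i \omega_i = I_f \omega_f$, 
which follows from Equation 5.11. The sketch below puts numbers on it; the moments of 
inertia and the initial spin rate are hypothetical values chosen only for illustration, 
not measured values for a real skater. 

```python
def spin_after(i_initial, omega_initial, i_final):
    """Angular velocity after the moment of inertia changes, conserving L = I * omega."""
    return i_initial * omega_initial / i_final


i_arms_out = 3.0      # kg m^2, hypothetical moment of inertia with arms extended
i_arms_in = 1.0       # kg m^2, hypothetical moment of inertia with arms pulled in
omega_arms_out = 2.0  # rad/s, hypothetical initial angular velocity

omega_arms_in = spin_after(i_arms_out, omega_arms_out, i_arms_in)

print(f"angular momentum before: {i_arms_out * omega_arms_out:.2f} kg m^2/s")
print(f"angular momentum after:  {i_arms_in * omega_arms_in:.2f} kg m^2/s")
print(f"spin rate goes from {omega_arms_out:.1f} to {omega_arms_in:.1f} rad/s")
```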

Another important derived quantity in physics is kinetic energy.⁵ 

$$K_t = \frac{1}{2} m v^2 \tag{5.12}$$

The subscript $t$ on the $K$ stands for translational, which is just normal linear kinetic 
energy. In rotational motion we have angular kinetic energy, 

$$K_r = \frac{1}{2} I \omega^2 \tag{5.13}$$

where the subscript $r$ stands for rotational. The total energy of a system, whether 
translational kinetic energy, angular kinetic energy, or potential energy, is always conserved. 
One final piece of the puzzle is still missing, and that is force. In linear motion, 

$$F = ma \tag{5.14}$$

by Newton's second law. In rotational motion a new quantity, torque (represented by 
the Greek letter $\tau$), is used.⁶ 

$$\tau = I\alpha \tag{5.15}$$

If a force is applied to a rotating system the torque on the system is, 

$$\tau = rF \sin\phi \tag{5.16}$$

where $\phi$ is the angle at which the force is applied, and $r$ is the distance from the axis of 
rotation.⁷ 

Linear Motion                                 Relation                 Rotational Motion
Quantity         Symbol                                                Quantity                 Symbol
---------------  ---------------------------  -----------------------  -----------------------  ------------------------------
distance         $x$                                                   angular distance         $\theta$
velocity         $v$                                                   angular velocity         $\omega$
acceleration     $a$                                                   angular acceleration     $\alpha$
mass             $m$                          $I = \int r^2 \, dm$     moment of inertia        $I$
momentum         $p = mv$                     $L = rp \sin\phi$        angular momentum         $L = I\omega$
kinetic energy   $K_t = \frac{1}{2} m v^2$                             angular kinetic energy   $K_r = \frac{1}{2} I \omega^2$
force            $F = ma$                     $\tau = rF \sin\phi$     torque                   $\tau = I\alpha$

Table 5.1: A summary of useful physical quantities used to describe a system in linear 
motion and rotational motion. The middle column gives the relation between 
the linear and rotational quantities. 

⁵ Kinetic energy is extremely useful for helping solve a variety of problems, especially in elastic collisions 
where it is conserved. See Chapter 4 for more detail. 

5.4 Comparison 

Quite a few new terms have been introduced in the sections above, which can be very 
daunting for someone who has never seen rotational motion before. The important point 
to remember is that for every quantity and law in linear motion, an analogous law 
or quantity is available in rotational motion. To help, Table 5.1 summarizes all the 
relations discussed above between linear and rotational motion. 

The equations of motion for an object moving under constant acceleration in linear 
motion can also be recast into rotational motion using the quantities discussed in the 
previous sections and summarized in Table 5.1. To do this, a simple substitution is made 
between linear quantities and rotational quantities with the results given in Table 5.2. 



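⁶ Torque is a vector, but for the purpose of this lab it is presented just as a scalar. 
⁷ In full vector notation, $\vec{\tau} = \vec{r} \times \vec{F}$. 

Linear Motion                         Rotational Motion
------------------------------------  -----------------------------------------------
$x = v_0 t + \frac{1}{2} a t^2$       $\theta = \omega_0 t + \frac{1}{2} \alpha t^2$
$x = v t - \frac{1}{2} a t^2$         $\theta = \omega t - \frac{1}{2} \alpha t^2$
$x = \frac{1}{2} (v_0 + v) t$         $\theta = \frac{1}{2} (\omega_0 + \omega) t$
$v = v_0 + a t$                       $\omega = \omega_0 + \alpha t$
$v = \sqrt{v_0^2 + 2 a x}$            $\omega = \sqrt{\omega_0^2 + 2 \alpha \theta}$
$a = (v - v_0) / t$                   $\alpha = (\omega - \omega_0) / t$

Table 5.2: A summary of the equations of motion for an object moving under constant 
acceleration in both linear and rotational motion. Here, $v_0$ and $\omega_0$ indicate 
initial velocity and initial angular velocity respectively. 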
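
The rotational equations of motion for constant angular acceleration can be written as 
small functions and checked against each other. In the sketch below the initial angular 
velocity, angular acceleration, and elapsed time are hypothetical values, and the 
function names are our own. 

```python
import math


def theta_from_time(omega_0, alpha, t):
    """Angular position: theta = omega_0 t + alpha t^2 / 2."""
    return omega_0 * t + 0.5 * alpha * t**2


def omega_from_time(omega_0, alpha, t):
    """Angular velocity: omega = omega_0 + alpha t."""
    return omega_0 + alpha * t


def omega_from_angle(omega_0, alpha, theta):
    """Angular velocity: omega = sqrt(omega_0^2 + 2 alpha theta)."""
    return math.sqrt(omega_0**2 + 2 * alpha * theta)


omega_0 = 1.0  # rad/s, hypothetical initial angular velocity
alpha = 0.5    # rad/s^2, hypothetical constant angular acceleration
t = 4.0        # s, hypothetical elapsed time

theta = theta_from_time(omega_0, alpha, t)
omega = omega_from_time(omega_0, alpha, t)

print(f"theta = {theta:.2f} rad, omega = {omega:.2f} rad/s")
print(f"omega from the angle:        {omega_from_angle(omega_0, alpha, theta):.2f} rad/s")
print(f"theta from average velocity: {0.5 * (omega_0 + omega) * t:.2f} rad")
print(f"theta from final velocity:   {omega * t - 0.5 * alpha * t**2:.2f} rad")
```

All of the routes give the same angle and the same angular velocity, which is a quick 
consistency check on the reconstruction of the equations of motion. 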



5.5 Periodic Harmonic Motion 

A defining characteristic of rotational motion is harmonic motion. Consider a rod 
rotating slowly at a constant angular velocity, but instead of viewed from above or below, 
viewed from profile. As the rod rotates, from the experimenter's view, the length of the 
rod will appear to grow and shrink periodically. When the rod is parallel with the 
experimenter its apparent length will be its actual length, $\ell$. When at an angle of 45° with 
the experimenter the rod will appear to have a length of $\ell/\sqrt{2}$ and when perpendicular 
to the experimenter, the rod will appear to have a length of zero. Plotting the relative 
length of the rod as observed by the experimenter versus time will yield a sine wave, 
clearly periodic harmonic motion. 

Of course the pendulum is also another example of harmonic motion, most naturally 
understood by using rotational motion.⁸ Another system, analogous to the pendulum 
(and to a simple linear spring oscillator) is the torsion spring. With a simple linear 
oscillator, force is related to displacement by Hooke's law. 

$$F = -kx \tag{5.17}$$

Now we simply substitute the rotational quantities discussed earlier into Hooke's law to 
obtain a rotational motion form. 

$$\tau = -\kappa\theta \tag{5.18}$$

Here, $\kappa$ is the torque constant, the rotational version of the linear force constant, $k$, 
with units traditionally given in Nm/radian. The period of oscillation for a harmonic 
oscillator governed by Hooke's law is given by⁹, 

$$T = 2\pi \sqrt{\frac{m}{k}} \tag{5.19}$$

where $T$ is period given in units of time. Using the rotational analogue of Hooke's law, 
we can find the period of a torsion spring by substitution. 

$$T = 2\pi \sqrt{\frac{I}{\kappa}} \tag{5.20}$$

This relation is very useful for finding the moment of inertia for complex objects, where 
the moment of inertia cannot be calculated analytically. By placing the object on a 
torsion spring with a known torque constant, the period of oscillation can be timed, and 
the moment of inertia solved for. 

$$I = \frac{\kappa T^2}{4\pi^2} \tag{5.21}$$

⁸ Understanding the basics of rotational motion is necessary for understanding the derivation of the 
period of a compound pendulum, as is done in Chapter 6. 
⁹ See Chapter 6 for a derivation of the period for a simple harmonic oscillator. 
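
Equations 5.20 and 5.21 are inverses of one another, which is easy to confirm with a short 
script. In the sketch below the torque constant and the moment of inertia are hypothetical 
values, not properties of the torsion spring used in the lab, and the function names are 
our own. 

```python
import math


def torsion_period(moment_of_inertia, torque_constant):
    """Period of a torsion spring oscillator, Equation 5.20."""
    return 2 * math.pi * math.sqrt(moment_of_inertia / torque_constant)


def moment_from_period(torque_constant, period):
    """Moment of inertia recovered from a timed period, Equation 5.21."""
    return torque_constant * period**2 / (4 * math.pi**2)


kappa = 0.05     # N m/rad, hypothetical torque constant
i_true = 0.004   # kg m^2, hypothetical moment of inertia

period = torsion_period(i_true, kappa)
i_recovered = moment_from_period(kappa, period)

print(f"period: {period:.3f} s")
print(f"moment of inertia recovered from the period: {i_recovered:.6f} kg m^2")
```

In the lab the period comes from a stopwatch rather than from Equation 5.20, so timing 
several oscillations and dividing by their number is a simple way to reduce the 
uncertainty on $T$ before it is squared. 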

5.6 Experiment 

One of the best ways to begin acquiring an intuitive feel for rotational motion problems 
is to observe rotational motion in the lab setting. The experiment associated with this 
chapter provides two investigations, one to explore rotational motion under constant 
acceleration, and the second to explore periodic harmonic oscillations from a torsion 
spring. 

In the first investigation, to obtain constant angular acceleration on a rotating object, 
a constant torque must be applied. Looking back at Table 5.1 we can see that by 
applying a constant force at a constant radius, we obtain a constant torque. Of course 
one of the best ways to apply a constant known force is to take advantage of the force 
due to gravity on an object.¹⁰ 

By applying a constant angular acceleration to an object we can verify that our system 
for rotational motion is consistent. First we can calculate a value for the torque being 
applied to the system by using Equation 5.16 and the moment of inertia for the system 
using Equation 5.5. From these quantities we can calculate a value for $\alpha$ using Equation 
5.15. Next we record angular position $\theta$ versus time. This relation should be governed 
by the first equation of motion from Table 5.2 if theory is correct. As we can start the 
experiment with $\omega_0 = 0$ we can plot $\theta$ versus $t^2$ and calculate $\alpha$ from the slope of the 
graph, which is $\alpha/2$. 

We know that angular velocity is just change in angular position over change in time, 
and so from the $\theta$ versus time data we can calculate $\omega$.¹¹ We associate the calculated 
$\omega$ with the average of the two times used to calculate the time difference. If the fourth 
equation of Table 5.2 holds, then we can now plot $\omega$ versus $t$ and again calculate $\alpha$ 
from the slope of the plot. If our theory is correct, then the values for $\alpha$ calculated using 
the three different methods should match. 
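
The two graphical methods for finding $\alpha$ can be tried out on synthetic data before 
going near the apparatus. The sketch below generates angular positions from the first 
equation of Table 5.2 with $\omega_0 = 0$, so the "data" are made up and contain no 
measurement noise; the angular acceleration, the sampling interval, and the helper name 
`slope` are all assumptions of this illustration. 

```python
ALPHA = 2.0   # rad/s^2, hypothetical angular acceleration used to make the data
DT = 0.1      # s, hypothetical sampling interval

times = [i * DT for i in range(11)]
thetas = [0.5 * ALPHA * t**2 for t in times]  # theta = alpha t^2 / 2 with omega_0 = 0


def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Method 1: theta against t^2 is a straight line with slope alpha / 2.
alpha_from_theta = 2 * slope([t**2 for t in times], thetas)

# Method 2: finite differences give omega at the midpoint of each time interval,
# and omega against t is a straight line with slope alpha.
mid_times = [(t0 + t1) / 2 for t0, t1 in zip(times, times[1:])]
omegas = [
    (th1 - th0) / (t1 - t0)
    for t0, t1, th0, th1 in zip(times, times[1:], thetas, thetas[1:])
]
alpha_from_omega = slope(mid_times, omegas)

print(f"alpha from theta against t^2: {alpha_from_theta:.3f} rad/s^2")
print(f"alpha from omega against t:   {alpha_from_omega:.3f} rad/s^2")
```

With real measurements the two values will differ slightly, and the finite difference 
method is usually the noisier of the two because it subtracts nearby readings. 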

The second investigation allows us to experimentally measure the moment of inertia 
for the rod used in the previous investigation, but more importantly validates the theory 
behind moment of inertia. This portion of the lab consists of measuring the period for a 
rod with masses on it in different configurations. From the period, the moment of inertia 
for the rod and masses can be calculated and compared with the theoretical moment of 
inertia for a rod calculated earlier in this chapter. 

¹⁰ We used this exact same method in the experiment associated with Chapter 3. 
¹¹ What we are doing here is numerical differentiation using a method known as the "finite difference" 
method. The specific method we use in the experiment associated with this chapter is equivalent to 
the "midpoint rule" oftentimes used for numerical integration. 

6 Pendulum 



Pendulums in today's modern age may seem insignificant, but less than a century ago 
the simple pendulum was still the most accurate method for keeping time on the planet.¹ 
With the development of the quartz crystal clock and now the use of atomic clocks, the 
pendulum has been rendered obsolete as a time keeping mechanism, yet remains as an 
integral part of most physics curriculums. This raises the question, why is the pendulum 
so important in physics? 

There are many answers to this question, but one answer dominates. The simple 
pendulum is an experimentally demonstrable, yet theoretically solvable example of simple 
harmonic motion accessible to most students without requiring an advanced knowledge 
of differential equations, but it can also be studied to a very advanced level of physics.² The 
combination of an analytically solvable theory with experiment is very rare in physics. 
Most real world scenarios do not have a theoretical solution as simple and beautiful as 
the solution to the pendulum. 

6.1 Simple Harmonic Motion 

Understanding the theory behind the physics of the pendulum begins with understanding 
simple harmonic motion (SHO). The mathematics behind SHO can at times become 
a little complex, but the end result is well worth the wait. To introduce SHO, let us 
consider the classic example used, a spring attached to a mass m at one end and an 
immoveable wall at the other end as shown in Figure 6.1. 

The block of mass m is displaced by a distance x from equilibrium, xq. Here, equi- 
librium is defined as the location where the potential energy of the system is minimized, 
or in this case, zero. Physically, this is where the block is placed such that the spring 
is not exerting a force on the block. The force on the block is then given by Hooke's 
law, 

F = -kx (6.1) 

where k is the spring constant and indicates how stiff the spring is. A large k means 
that the spring is very stiff, while a small k means the spring is easily compressed. Notice 
the negative sign in the equation above, this is intentional. When the block is pulled 
away from the wall, the spring exerts a restoring force drawing it back towards the 



1 Marrison, Warren. "The Evolution of the Quartz Crystal Clock". Bell System Technical Journal 27.
1948. pp. 510-588.
2 An excellent paper demonstrating the incredible depth of the pendulum problem is available from
Nelson and Olsson.
Figure 6.1: Force diagram for simple harmonic motion for a block of mass m attached
to a spring with force constant k. The diagram labels the equilibrium position x_0, the
displacement x, and the restoring force F = -kx.



wall. Because we are working with a one-dimensional example, we have dropped the 
vector symbols for both force and displacement. 

Using Newton's second law we can relate the mass and acceleration of the block to 
the force exerted on the block by Hooke's law. 



ma = m \frac{d^2 x}{dt^2} = -kx (6.2)



This is a second order differential equation of degree one, for which the solution is well
known. As this chapter is not about differential equations we will assume we know the
solution, but for those curious about the details there is an excellent book on differential
equations, Elementary Differential Equations by Arthur Mattuck. 3



x(t) = A \cos(\omega t) (6.3)



The equation above is a particular solution to Equation 6.2, but suits our needs
perfectly. We see that the value A represents the amplitude of the oscillations (or the
maximum displacement of the block), as the cosine function reaches a maximum value
of one, and can be rewritten as the initial displacement of the block, or x. The value ω
is the angular frequency at which the block is oscillating. We can see that every
2π/ω seconds the block will have returned to its initial position.

Taking the first derivative of Equation 6.3 with respect to time we can find the velocity 
of the block as a function of time, and by taking the second derivative we can find the 
acceleration of the block also as a function of time. 



v(t) = -\omega A \sin(\omega t)
a(t) = -\omega^2 A \cos(\omega t) (6.4)



3 This is the textbook used for the MIT OpenCourseWare materials for 18.03 also taught by Arthur 
Mattuck. This course is the most popular introductory math course at MIT, and with good reason, 
some of his more hilarious quotes are here. 

Plugging acceleration and position back into Equation 6.2 we can find ω in terms of m
and k.

-m\omega^2 A \cos(\omega t) = -kA \cos(\omega t) \Rightarrow m\omega^2 = k \Rightarrow \omega = \sqrt{\frac{k}{m}} (6.5)

From the angular frequency we can then find the period of oscillation, or the time it
takes the block to return to its initial displacement distance, x.

T = \frac{2\pi}{\omega} = 2\pi \sqrt{\frac{m}{k}} (6.6)

This result has one very important consequence: the period is not dependent on how far the
block is initially displaced from equilibrium! This is a very important result for
pendulums, as will be shown shortly.
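The amplitude independence of the period can be checked numerically. The following is a minimal sketch, not part of the original handout: the mass, spring constant, amplitudes, time step, and the function names `sho_period` and `simulated_period` are all illustrative assumptions. It steps Equation 6.2 forward in time and compares the measured period with Equation 6.6.

```python
import math


def sho_period(m, k):
    """Period of a mass on a spring, T = 2*pi*sqrt(m/k) (Equation 6.6)."""
    return 2.0 * math.pi * math.sqrt(m / k)


def simulated_period(m, k, amplitude, dt=1e-5):
    """Estimate the period by stepping m*a = -k*x (Equation 6.2) in time.

    The block starts at rest at x = amplitude. Half a period has passed
    when the velocity first changes sign from negative to positive,
    which happens at the far turning point x = -amplitude.
    """
    x, v, t = amplitude, 0.0, 0.0
    while True:
        # Semi-implicit Euler step: update the velocity, then the position.
        v_new = v - (k / m) * x * dt
        x += v_new * dt
        t += dt
        if v < 0.0 and v_new >= 0.0:
            return 2.0 * t
        v = v_new


m, k = 0.5, 20.0  # illustrative values in kg and N/m
print(f"Equation 6.6: T = {sho_period(m, k):.4f} s")
for amplitude in (0.01, 0.1, 0.5):  # initial displacements in m
    print(f"A = {amplitude} m: T = {simulated_period(m, k, amplitude):.4f} s")
```

Every amplitude should give the same period to within the accuracy of the time step, which is the behavior predicted above.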

6.2 Simple Pendulum 

The simple pendulum consists of a pendulum bob of mass m attached to a string
of length ℓ pivoting about a pivot P. The string is assumed to have no mass, and the
pendulum bob is assumed to be a point mass (all of its mass is within an infinitely
small point). The force diagram for a simple pendulum is given in Figure 6.2(a). The
pendulum bob is experiencing two forces, the tension of the string T, which provides the
centripetal force, and the gravitational force mg. The entire pendulum is displaced from
equilibrium by an angle θ, or by an arc distance s.

We can further split the gravitational force into two components, centripetal and
tangential. The centripetal component must be cancelled by the tension of the string,
otherwise the pendulum bob would go flying through the air. The tangential force however
is not counteracted and so the pendulum will accelerate towards the equilibrium; this
force acts as the restoring force for the system just as the spring provided the restoring
force in the previous example.

We can now write the equivalent of Hooke's law from Equation 6.1, but now for the 
pendulum. 

ma = -mg \sin\theta = -ks = -k\ell\theta \Rightarrow mg \sin\theta = k\ell\theta (6.7)

In the first step we replace ma with the tangential force -mg sin θ. In the second
step we replace the displaced distance x with the displaced arc length s. In the final
step we write arc length in terms of θ and length ℓ using simple trigonometry, s = ℓθ.

We must now take one final step, and that is the small angle approximation. This
approximation states that if the angle θ is sufficiently small (usually under 10°) then
sin θ ≈ θ and cos θ ≈ 1, with θ measured in radians. By looking at a unit circle you can
convince yourself of the validity of this approximation. By making this approximation we
can now solve for k in terms of ℓ, m, and g.

k = \frac{mg}{\ell} (6.8)
Figure 6.2: Force diagrams for (a) a simple pendulum and (b) a general compound pendulum.
The diagrams label the gravitational force mg and its components mg sin θ and mg cos θ.

Furthermore we can place this k back into the formula we derived for period, Equation 
6.6, and find the period of the pendulum. 



T = 2\pi \sqrt{\frac{\ell}{g}} (6.9)



We see that the masses cancel and the period is dependent only on g (a relatively global
constant) and the length of the pendulum ℓ, not the mass or the initial displacement!
This result, isochronism, demonstrates why pendulums are so useful as time keepers,
and also shows how we can measure the value of g if we know the length of the pendulum.
It is, however, important to remember that we did make use of the small angle
approximation. This means that this equation is only valid when we displace the pendulum
by small angles θ. Later on we will see the difference this approximation makes in the
period of a pendulum at large angles.
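As a sketch of how Equation 6.9 can be turned around to measure g, the short example below inverts the formula for a timed pendulum. The measurement values (20 swings of a 0.75 m pendulum in 34.8 s) are made up for illustration, and the function names are hypothetical.

```python
import math


def pendulum_period(length, g=9.81):
    """Small angle period of a simple pendulum, Equation 6.9."""
    return 2.0 * math.pi * math.sqrt(length / g)


def g_from_pendulum(length, period):
    """Equation 6.9 solved for g: g = 4*pi^2*length / period^2."""
    return 4.0 * math.pi ** 2 * length / period ** 2


# Hypothetical measurement: 20 full swings timed with a stopwatch.
length = 0.75              # pendulum length in m
period = 34.8 / 20         # time for one swing in s
g_measured = g_from_pendulum(length, period)
print(f"T = {period:.3f} s gives g = {g_measured:.2f} m/s^2")
```

Timing many swings and dividing by the count, as assumed here, reduces the effect of reaction time on the measured period.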

6.3 Compound Pendulum 

A compound pendulum is any rigid body which rotates around a pivot, as shown 
in Figure 6.2(b). The method outlined above for determining the period of the simple 
pendulum works well, but for the compound pendulum this method no longer works as 
gravity is now acting on the entire pendulum, not just the pendulum bob. To circumvent
this problem we introduce a new method to determine the period of the pendulum,
using rotational motion. For those readers not familiar with rotational motion, reread 
Chapter 5. The general idea however is that for every quantity used in linear motion to 
describe the motion of an object, there is an equivalent quantity in rotational motion. 
A summary of the relationship between linear and rotational motion is given in Table 
5.1 of Chapter 5. 

Returning to the force diagram of Figure 6.2(b) we see that a gravitational force of mg
is still being exerted on the center of mass of the pendulum at an angle θ. From this, the
torque on the system is just -ℓmg sin θ. We can use Newton's second law in rotational
motion to set torque equal to moment of inertia, I, times angular acceleration, α, with
the equivalent relation in linear motion given previously by Equation 6.7. The similarities
are striking.

I\alpha = -\ell mg \sin\theta = -\kappa\theta \Rightarrow mg\ell \sin\theta = \kappa\theta (6.10)

We again make the small angle approximation and solve for κ (our rotational motion
equivalent of k).

\kappa = mg\ell (6.11)

Substituting κ for k, and I for m in Equation 6.6 we now have the solution for the
period of any pendulum with moment of inertia I and distance ℓ between the pivot of
the pendulum and the center of mass of the pendulum.

T = 2\pi \sqrt{\frac{I}{mg\ell}} (6.12)

For the case of the simple pendulum the moment of inertia is just mℓ², which when
plugged back into Equation 6.12 returns the same period as Equation 6.9.
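Written out, the substitution for the simple pendulum is a single line. This is only the algebra implied by the sentence above, using I = mℓ² for a point mass a distance ℓ from the pivot.

```latex
T = 2\pi\sqrt{\frac{I}{mg\ell}}
  = 2\pi\sqrt{\frac{m\ell^2}{mg\ell}}
  = 2\pi\sqrt{\frac{\ell}{g}}
```

The mass cancels just as it did for the simple pendulum.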

6.4 Small Angle Approximation 

One final issue to discuss is the validity of the small angle approximation made for both
the simple and compound pendulums. As stated earlier, this approximation is only
valid for θ ≪ 1, usually θ < 10°. However, it is important to understand how this
approximation affects the results of our theory. A closed analytic solution to Equation
6.2 when replacing kx with mg sin θ does not exist in terms of elementary functions, but
the period can be expressed by a perturbation series in θ. 4




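T = 2\pi \sqrt{\frac{\ell}{g}} \left[ 1 + \sum_{n=1}^{\infty} \left( \frac{(2n)!}{2^{2n} (n!)^2} \right)^2 \sin^{2n}\left( \frac{\theta}{2} \right) \right] (6.13)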



4 Nelson, Robert and M. G. Olsson. "The pendulum - Rich physics from a simple system". American 
Journal of Physics 54 (2). February 1986. pp. 112-121. 

It is not necessary to understand the above equation in detail, but it is important to
understand the results given in Figure 6.3(a). Here a convenient length for the simple
pendulum has been chosen, ℓ = g/(4π²), such that the period for the pendulum as given
by Equation 6.9 is just one second. For initial displacement angles less than 16° the
percentage error caused by making the small angle approximation is less than 1%. As
θ grows, the error from the small angle approximation grows as well, with an initial
displacement angle of 90° causing an error of roughly 18%.
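The size of the small angle error can be estimated with a few lines of code. The sketch below assumes the standard series for the exact pendulum period in powers of sin(θ/2), which is written out in the docstring; the function name `period_ratio` and the number of terms kept are illustrative choices rather than anything taken from the experiment.

```python
import math


def period_ratio(theta_deg, terms=200):
    """Ratio of the exact period to the small angle period 2*pi*sqrt(l/g).

    Assumes the series
        1 + sum_{n>=1} [ (2n)! / (2^(2n) (n!)^2) ]^2 * sin^(2n)(theta/2)
    where theta is the initial displacement angle.
    """
    s2 = math.sin(math.radians(theta_deg) / 2.0) ** 2
    ratio, coeff, power = 1.0, 1.0, 1.0
    for n in range(1, terms + 1):
        # (2n)! / (2^(2n) (n!)^2) equals the product of (2k - 1) / (2k) for k = 1..n.
        coeff *= (2 * n - 1) / (2 * n)
        power *= s2
        ratio += coeff ** 2 * power
    return ratio


for angle in (5, 10, 16, 45, 90):
    error = 100.0 * (period_ratio(angle) - 1.0)
    print(f"initial angle {angle:3d} deg: period is {error:.2f}% longer")
```

With the one second pendulum described above, the printed percentage is also the error in the period measured in hundredths of a second.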

6.5 Experiment 

For the experiment associated with this chapter a compound pendulum is used to validate 
Equation 6.12 and to experimentally measure a value for g. For the purposes of this 
experiment we assume the width of the pendulum is much less than that of its length 
L, and so we can approximate the moment of inertia for the pendulum as that of a rod. 
The moment of inertia for a rod rotating around its center of mass is,

I = \frac{mL^2}{12} (6.14)

as derived in Chapter 5. Using the parallel axis theorem we see that a rod rotating
about an axis at distance ℓ from its center of mass has the following moment of inertia.

I = \frac{mL^2}{12} + m\ell^2 (6.15)

Plugging this back into Equation 6.12 we can theoretically predict the period for the 
compound pendulum used in this experiment. 




T = 2\pi \sqrt{\frac{\frac{mL^2}{12} + m\ell^2}{mg\ell}} = 2\pi \sqrt{\frac{L^2}{12g\ell} + \frac{\ell}{g}} = 2\pi \sqrt{\frac{\ell}{g}} \sqrt{1 + \frac{L^2}{12\ell^2}} (6.16)

Notice that this formula is dependent on both ℓ, the distance of the pivot from the center
of mass of the rod (in this case just the center of the rod), and L, the entire length of
the rod. It is important to not confuse these two quantities.

Using Equations 6.9 and 6.16 we can compare the periods of oscillation for a simple
pendulum and a compound pendulum made from a rod versus the variable length of
either pendulum, ℓ. For the simple pendulum we expect a simple parabola on its side from
the square root term. The simple pendulum behavior is shown by the solid line in blue in
Figure 6.3(b). The change in period with respect to ℓ for the rod compound pendulum
is more complicated, as is apparent from Equation 6.16. For a rod of L = 1.2 m (close
to the length of the compound pendulum used for the experiment associated with this
chapter), the behavior of the period with respect to ℓ is plotted in dashed red in Figure
6.3(b) as well.

Perhaps one of the most noticeable aspects of the comparison between the two periods
is that for the simple pendulum, the period approaches zero as ℓ approaches zero. For
the compound pendulum however, the period approaches infinity as ℓ approaches zero.



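\lim_{\ell \to 0^+} 2\pi \sqrt{\frac{\ell}{g}} \sqrt{1 + \frac{L^2}{12\ell^2}} = \lim_{\ell \to 0^+} 2\pi \sqrt{\frac{\ell}{g} + \frac{L^2}{12g\ell}} = \lim_{\ell \to 0^+} 2\pi \sqrt{\frac{L^2}{12g\ell}} = +\infty (6.17)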



For very large £ the periods of the simple and rod compound pendulum match. This 
can be seen mathematically by taking the limit of Equation 6.16, 



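\lim_{\ell \to +\infty} 2\pi \sqrt{\frac{\ell}{g}} \sqrt{1 + \frac{L^2}{12\ell^2}} = \lim_{\ell \to +\infty} 2\pi \sqrt{\frac{\ell}{g}} \sqrt{1 + 0} = \lim_{\ell \to +\infty} 2\pi \sqrt{\frac{\ell}{g}} (6.18)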



which we see is exactly the same as the period for the simple pendulum. Physically this
makes sense as well. For ℓ ≫ L we see that the physical set up approaches that of the
simple pendulum. The rod is no longer rotating about itself, but must be attached to
the pivot by some massless connector. As ℓ becomes larger and larger, the rod becomes
more and more like a point mass.

There is another important observation to make about Figure 6.3(b). The period of 
the rod compound pendulum is not monotonic (always increasing or decreasing) like that 
of the simple pendulum but begins large, decreases rapidly, and then begins to increase 
again. By taking the derivative of Equation 6.16 with respect to ℓ and setting this to
zero we can find the ℓ which provides the minimum period.



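\frac{dT}{d\ell} = \frac{\pi}{2\sqrt{3g}} \, \frac{12\ell^2 - L^2}{\sqrt{\ell^3 (L^2 + 12\ell^2)}} = 0 \Rightarrow \ell = \frac{L}{\sqrt{12}} (6.19)

T_{min} = \sqrt{8\pi^2} \sqrt{\frac{L}{g\sqrt{12}}}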

Calculating this out with a value of g = 9.81 m/s² and L = 1.2 m gives a minimum
period of T_min = 1.7 s at ℓ = 0.35 m for the rod compound pendulum used in the
experiment associated with this chapter.
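The behavior of the rod pendulum period can be explored numerically. The sketch below evaluates Equations 6.9 and 6.16 for the L = 1.2 m rod, finds the minimum period both from ℓ = L/√12 and from a coarse scan over ℓ, and checks that the two pendulums agree for very large ℓ. The function names, the scan spacing of 1 cm, and the choice of ℓ = 100 m for the large ℓ check are illustrative assumptions.

```python
import math

G = 9.81  # acceleration due to gravity in m/s^2


def simple_period(l, g=G):
    """Equation 6.9: period of a simple pendulum of length l."""
    return 2.0 * math.pi * math.sqrt(l / g)


def rod_period(l, L, g=G):
    """Equation 6.16: rod of length L pivoted a distance l from its center."""
    return simple_period(l, g) * math.sqrt(1.0 + L ** 2 / (12.0 * l ** 2))


L = 1.2                                  # rod length in m
l_min = L / math.sqrt(12.0)              # pivot distance giving the minimum period
t_min = rod_period(l_min, L)

# Independent check: scan pivot distances from 1 cm to 2 m in 1 cm steps.
scan = [0.01 * i for i in range(1, 201)]
l_scan = min(scan, key=lambda l: rod_period(l, L))

print(f"analytic minimum: l = {l_min:.3f} m, T = {t_min:.3f} s")
print(f"scanned minimum:  l = {l_scan:.2f} m, T = {rod_period(l_scan, L):.3f} s")
print(f"ratio of rod to simple period at l = 100 m: {rod_period(100.0, L) / simple_period(100.0):.6f}")
```

The scan is deliberately crude; it is only there to confirm that the analytic minimum sits where the period stops decreasing and starts increasing.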
Figure 6.3: The period of a simple pendulum with ℓ = g/(4π²) versus initial displacement
angle θ without making the small angle approximation is given in the top
plot. A comparison between the periods of a rod pendulum with L = 1.2 m
and the simple pendulum is given in the bottom plot.



7 Gas 



Understanding how gasses interact with their environment has many connections to
every day life, ranging from important matters such as how the earth's atmosphere is
contained (so that we can breathe), to more mundane matters (yet still very important)
such as how long it takes flatulence to spread through a closed room.

Before diving directly into the theory of gasses, a little history is necessary to under- 
stand the beginning of the fields of thermodynamics and statistical mechanics, both 
of which are closely tied with modeling the behavior of gasses. In 1660 Robert Boyle 1 
published a book with the rather long name of New Experiments Physico-Mechanicall, 
Touching the Spring of the Air, and its Effects. In this book Boyle described a series of 
experiments that he had done, with the help of his colleague Robert Hooke, which are 
the first known rigorous experiments regarding the behavior of gasses. Two years later, 
after some of his colleagues suggested he rewrite his book, he formulated Boyle's law 
purely from experimental observation. 

pV = k_B (7.1)

Here p is the pressure of a gas, V the volume of the gas, and k_B some constant that is
dependent upon the experimental setup. Physically speaking, the law states that the
pressure of a gas is inversely proportional to the volume of the gas. 2

Despite this rather important breakthrough, the field of thermodynamics languished 
for another hundred years until the arrival of Carnot and others, primarily because the 
physicists of the time were too distracted trying to build steam engines. 3 The next 
major breakthrough was made by Gay-Lussac who postulated that the volume of a gas 
is directly proportional to its temperature. 4 

V = k_C T (7.2)

This law is called Charles' Law as Gay-Lussac claimed the law was experimentally
discovered previous to his own discovery. 5 Additionally Gay-Lussac also noticed that the
pressure of a gas is proportional to its temperature as well, and so he also postulated
what is known as Gay-Lussac's Law.

p = k_G T (7.3)



1 Robert Boyle was born a native of Ireland to the 1st Earl of Cork and amongst his contemporaries
was regarded as one of the world's leading physicists.
2 J Appl Physiol 98:31-39, 2005. http://jap.physiology.Org/cgi/reprint/98/l/31
3 There may be other reasons as well.
4 The Expansion of Gasses through Heat. Annales de Chimie 43, 137 (1802).
http://web.lemoyne.edu/~giunta/gaygas.html
5 Gay-Lussac, while having a rather long and difficult to pronounce name, certainly seemed to be an all
around good guy.



Notice that the constant of proportionality here, k_G, is not the same as the constant of
proportionality in Boyle's Law, k_B, nor the same as in Charles' Law, k_C, and that each
of these constants is entirely dependent upon the experimental setup.

With these three laws the three macroscopic state variables of a gas, pressure, 
volume, and temperature, are connected. Of course it would be nice to have one equation 
instead of three, and this was eventually both experimentally and theoretically arrived 
at in the ideal gas law. 



pV = nRT (7.4)



Here n is the number of moles of gas and R is the ideal gas constant. Figure 7.1 shows
a three dimensional plot of the ideal gas law with the z-axis corresponding to pressure,
the x-axis to volume, and the y-axis to temperature. The isolines in red demonstrate
Boyle's law, the isolines in blue Gay-Lussac's law, and the isolines in green Charles' law.
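A short numerical sketch shows how each of the three laws appears as a slice of the ideal gas law when one state variable is held fixed. The amount of gas, volumes, and temperatures below are arbitrary illustrative values, and the function names are hypothetical.

```python
R = 8.314  # ideal gas constant in J/(mol K)


def pressure(n_moles, volume, temperature):
    """Equation 7.4 solved for pressure (Pa), with volume in m^3 and temperature in K."""
    return n_moles * R * temperature / volume


def volume(n_moles, pressure_pa, temperature):
    """Equation 7.4 solved for volume (m^3)."""
    return n_moles * R * temperature / pressure_pa


n = 1.0  # one mole of gas

# Boyle's law: at fixed temperature the product p*V stays constant.
boyle_a = pressure(n, 0.010, 293.15) * 0.010
boyle_b = pressure(n, 0.020, 293.15) * 0.020

# Gay-Lussac's law: at fixed volume, doubling T doubles p.
gay_lussac_ratio = pressure(n, 0.010, 600.0) / pressure(n, 0.010, 300.0)

# Charles' law: at fixed pressure, doubling T doubles V.
charles_ratio = volume(n, 101325.0, 600.0) / volume(n, 101325.0, 300.0)

print(f"Boyle: pV = {boyle_a:.1f} J and {boyle_b:.1f} J")
print(f"Gay-Lussac: p ratio = {gay_lussac_ratio:.3f}")
print(f"Charles: V ratio = {charles_ratio:.3f}")
```

Note that the temperatures must be in Kelvin for the proportionalities to hold.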







Figure 7.1: The ideal gas law given in Equation 7.4 with isoline profiles for Boyle's, 
Charles', and Gay-Lussac's laws given in red, green, and blue respectively. 
The isolines for Boyle's law are given for a temperature every 300 degrees 
Celsius, the isolines for Charles' law are given for a pressure every 0.3 bar, 
and the isolines for Gay-Lussac's law are given for a volume every 0.3 cm³.



The ideal gas law was originally arrived at experimentally, but eventually was derived
using kinetic gas theory. Later, as thermodynamics developed into statistical mechanics,
a statistical derivation was also discovered. The statistical derivation can be a bit
daunting without the proper background in statistical mechanics, and so the following
section attempts to give an intuitive feel for the theoretical origin of the ideal gas
law through kinetic gas theory.

7.1 Kinetic Theory 

In the previous section we briefly mentioned the pressure, volume, and temperature of a 
gas, but what exactly do these quantities mean? The volume of a gas is just the volume 
of the container in which the gas is confined. The pressure of a gas is the amount of 
force per unit area that the gas is exerting on its container. The temperature of a gas is 
technically defined as the partial derivative of the energy of the gas taken with respect 
to the entropy of the gas. This definition is not very helpful without more theory, so 
for the purposes of this chapter, we can also write temperature of an ideal gas as, 

T = \frac{2K}{3R} (7.5)

where K is the average kinetic energy of the gas. 6 

There is one final definition that needs to be given, and that is for an ideal gas. An 
ideal gas must satisfy the following assumptions. 

1. The atoms are point-like with the same mass and do not interact except through 
collisions. 

2. All collisions involving the atoms must be elastic (with either the wall of the
container, or between individual gas atoms).

3. The atoms obey Newton's laws. 

4. There are a large number of atoms, moving at random speeds which do not change 
over time. 

With these assumptions for an ideal gas, and the definitions for pressure, volume, and 
temperature above, we can now begin the derivation of the ideal gas law. 

Consider a cube filled with gas as shown in Figure 7.2. Each side of the cube is of
length L and so the volume of the cube is just L³. The cube is filled with N gas atoms
each with a velocity v_i. The magnitude of the velocity, or the speed of each gas atom,
does not change over time because all the collisions within the cube are elastic (we


6 For those who are curious this relation can be found from the Boltzmann distribution and by 
assuming that an ideal gas has three translational degrees of freedom, corresponding to the three 
physical dimensions. 
7 For more information on elastic collisions read over Chapter 4. In elastic collisions two quantities are
conserved, momentum and kinetic energy.

Figure 7.2: A cube with width, length, and height L filled with an ideal gas of N particles.



used assumptions 1, 2, and 3 to arrive at this conclusion). The average of the squared 
velocities for all the gas atoms is just, 



\overline{v^2} = \frac{v_1^2 + v_2^2 + \cdots + v_N^2}{N} (7.6)



where \overline{v^2} is the average of the squared velocities. In general, a straight line
segment over a symbol indicates an average. Notice we have not written \overline{v}^2 as
this would indicate taking the average of the velocity and then squaring rather than
squaring the velocities and then averaging. Using the value for the average of the squared
velocities in Equation 7.6, we are able to write the average kinetic energy of the gas.



K = \frac{1}{2} m \overline{v^2} (7.7)



Here, m is the mass of one gas atom. Notice from assumption 1 all the particles have 
the same mass, and so the average mass is just the mass of one gas atom. 

We can also break the average of the squared velocities into the components of the 
average velocity. 8 



\overline{v^2} = \overline{v_x^2} + \overline{v_y^2} + \overline{v_z^2} (7.8)



By assumption 4 we know there are a large number of randomly moving gas atoms in 
the cube, and so if the cube was rotated 90° to the left with the y-axis now where the 



8 For readers unfamiliar with vectors, the magnitude squared of a vector is equal to the sum of the
squares of the components. This is exactly the same as the Pythagorean theorem which states the
square of the hypotenuse is equal to the sum of the squares of the other two sides of a right triangle.

z-axis was, no one could tell the difference. This tells us that the average component
velocities must be equal and so the average of each component velocity squared must
also be equal.

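\overline{v_x^2} = \overline{v_y^2} = \overline{v_z^2} \Rightarrow \overline{v^2} = 3\overline{v_y^2} \Rightarrow K = \frac{3}{2} m \overline{v_y^2} (7.9)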

In the second step of the equation above we have rewritten Equation 7.8, but now with
only the y-component of velocity as all the average components must be equal. In the
third step we have just plugged this value for \overline{v^2} into Equation 7.7. Now we can
take the final result for the average kinetic energy in Equation 7.9 and plug this into our
equation for temperature, given in Equation 7.5.

T = \frac{m \overline{v_y^2}}{R} \Rightarrow \overline{v_y^2} = \frac{RT}{m} (7.10)

Returning to Figure 7.2, we can consider a single gas atom with a velocity v = v_y. The
atom could be (and probably is) also moving in the x and z directions, but we ignore
that for now. From Newton's laws (assumption number 3) we know that,

\overline{F} = ma = m \frac{\Delta v}{\Delta t} = \frac{\Delta P}{\Delta t} (7.11)

where in the second step we have just written out the definition for acceleration, and in
the third step substituted in momentum for mΔv. We are using a capital P to denote
momentum to avoid confusion with pressure, which is represented by a lower case p. 9
Now consider what happens if the atom bounces off the wall of the cube. Because the
collision is elastic (by assumption 2) and we assume the wall of the cube is very massive
with respect to the gas atom, the velocity of the gas atom after the collision with the
wall is just -v_y. 10

The initial momentum of the gas particle was mv_y and the final momentum of the
gas particle was -mv_y so the magnitude of the change in momentum of the gas particle was
ΔP = 2mv_y. Every time the atom bounces off the wall, there is a momentum change of
2mv_y. If the atom travels between the two walls of the cube, bouncing off each time, we
know that the time between bounces off the same wall is the distance traveled, 2L, divided
by the velocity and so Δt = 2L/v_y. Plugging ΔP and Δt into Equation 7.11 we arrive at a
value for the average force exerted by a single gas atom.

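\overline{F} = \Delta P \left( \frac{1}{\Delta t} \right) = 2mv_y \left( \frac{v_y}{2L} \right) = \frac{mv_y^2}{L} (7.12)

Now we can think of the total average force of N particles with average squared velocity
\overline{v_y^2}.

\overline{F}_N = \frac{mN\overline{v_y^2}}{L} (7.13)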



9 In every other chapter, p is used to notate momentum.
10 Check this with Equation 6 from Chapter 4. Let the mass of the gas atom approach and check the
result.
We can relate the average total force of the particles to the pressure, as pressure is
just force over area. The area here is just the area of one wall of the cube, or L².

\overline{F}_N = pA = pL^2 \Rightarrow \frac{mN\overline{v_y^2}}{L} = pL^2 \Rightarrow p = \frac{mN\overline{v_y^2}}{L^3} = \frac{mN\overline{v_y^2}}{V} (7.14)

If we plug in \overline{v_y^2} from Equation 7.10 into the result above, we obtain the ideal gas law!

p = \left( \frac{mN}{V} \right) \left( \frac{RT}{m} \right) \Rightarrow pV = NRT (7.15)

Notice we have a capital N (number of atoms) rather than lower case n (number of 
moles). This is just a matter of notation. Typically the ideal gas constant, R, is given 
in Joules per mole per Kelvin, in which case small n (moles) should be used instead of 
large N (atoms). 
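The chain of reasoning from Equation 7.5 to Equation 7.15 can be checked with a small numerical experiment. The sketch below is an illustration under stated assumptions rather than part of the original derivation: the velocity components are drawn from a Gaussian distribution, the atom mass is roughly that of helium, the gas constant per atom is taken to be the Boltzmann constant, and all variable names are hypothetical.

```python
import math
import random

random.seed(1)

R_PER_ATOM = 1.380649e-23  # gas constant per atom (Boltzmann constant) in J/K
m = 6.6e-27                # mass of one atom in kg, roughly helium (assumed)
N = 100_000                # number of simulated atoms
L = 0.1                    # side length of the cube in m
T_draw = 300.0             # temperature used to generate the velocities, in K

# Assumption: each velocity component is Gaussian with variance R*T/m.
sigma = math.sqrt(R_PER_ATOM * T_draw / m)
velocities = [
    (random.gauss(0.0, sigma), random.gauss(0.0, sigma), random.gauss(0.0, sigma))
    for _ in range(N)
]

mean_v2 = sum(vx * vx + vy * vy + vz * vz for vx, vy, vz in velocities) / N  # Equation 7.6
mean_vy2 = sum(vy * vy for _, vy, _ in velocities) / N

K = 0.5 * m * mean_v2                 # Equation 7.7
T = 2.0 * K / (3.0 * R_PER_ATOM)      # Equation 7.5
V = L ** 3
p = m * N * mean_vy2 / V              # Equation 7.14

print(f"T from kinetic energy: {T:.1f} K")
print(f"pV / (N R T) = {p * V / (N * R_PER_ATOM * T):.4f}")
```

The ratio printed on the last line should be close to one, and it approaches one more closely as N grows because the average of the squared y-velocity approaches one third of the average squared speed.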

7.2 Experiment 

The experiment associated with this chapter consists of experimentally verifying both 
Boyle's law from Equation 7.1 and Gay-Lussac's law from Equation 7.3. This first 
experiment is performed by pressurizing a column of air with a bike pump and taking 
volume measurements as more and more pressure is added. As the pressure increases, 
the volume must decrease according to Boyle's law. The raw data from this part of the 
experiment should look similar to the red lines in Figure 7.1, and be nearly identical to 
the lowest isoline (corresponding to a temperature of 20° Celsius). By making a linear 
plot of pressure on the x-axis and 1/V on the y-axis, it is possible to verify Boyle's law 
with a linear fit. 

Gay-Lussac's law can be verified by keeping a constant volume of gas and changing
the temperature while measuring the pressure. In this experiment a small container of
gas is heated, and a pressure gauge allows the pressure to be read. According to
Equation 7.3 the temperature and pressure should rise linearly, and if a plot is made it
should closely resemble the blue isolines of Figure 7.1. The volume of the apparatus is
near 10 cm³, and so the raw data from this part of the experiment should closely follow
the lower isolines. Because of this we expect that a large change in temperature will
yield a relatively small change in pressure.

While verifying Gay-Lussac's law, it is possible to experimentally determine absolute 
zero (0 Kelvin) in degrees Celsius! In all the formulas involving temperature in this 
chapter, the temperature must be given in Kelvin. Because the Kelvin scale is the same 
as the Celsius scale, except with a constant term added on, we can rewrite Equation 7.3. 



p = k_G (T_C - T_0) (7.16)

Here T_C is temperature in degrees Celsius, and T_0 absolute zero in degrees Celsius. By
formatting this linear relationship in slope-intercept form we see that the intercept of
the plot is just,

b = -k_G T_0 = -mT_0 (7.17)

where we already know k_G from the slope of the plot, m. Plugging in values for m and b as
obtained from the best fit of the plot, we can solve for T_0 and find absolute zero!
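As a sketch of the analysis, the example below fits a straight line to pressure versus Celsius temperature and extrapolates to zero pressure. Because the pressure in Equation 7.16 vanishes when T_C = T_0, absolute zero is the temperature at which the fitted line crosses zero pressure. The six readings are made up for illustration and are not real data; the helper `linear_fit` is a plain least-squares fit written out by hand.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up readings: temperature in degrees Celsius, pressure in kPa.
temps_c = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
pressures = [101.3, 104.9, 108.2, 111.8, 115.1, 118.7]

slope, intercept = linear_fit(temps_c, pressures)

# The fitted line p = slope * T_C + intercept reaches p = 0 at T_C = -intercept / slope.
absolute_zero = -intercept / slope
print(f"slope = {slope:.4f} kPa/C, intercept = {intercept:.2f} kPa")
print(f"estimated absolute zero = {absolute_zero:.1f} C")
```

Since the extrapolation reaches far outside the measured range, small errors in the slope produce large errors in the estimate, so real data will typically land some distance from the accepted value.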
Electricity has captivated the imagination of humans since the earliest recordings of
civilizations, yet it was not until the 17th century that we truly began to understand
the fundamental nature of electricity. Now, electromagnetism is described by the most 
precise theory ever developed by the physics community. What began with experiments 
ranging from flying kites in thunderstorms to catching electric eels has now culminated 
in providing the base for all modern technology. The study of electromagnetism today 
can be broadly broken down into applied electrical engineering and theoretical electro- 
dynamics. 

Despite being united by electricity, these two fields of study are completely different,
yet equally important. The engineering aspect allows for new technologies to be
developed, while the theoretical side develops new ideas that can be implemented in
practice. However, both of these fields require an understanding of the fundamentals of
electrodynamics. There are a large number of books on the subject; some are good while
others are terrible. Two particularly excellent resources are MIT's OpenCourseWare
materials for 8.02 as taught by Walter Lewin, and Introduction to Electrodynamics by
David Griffiths. More in depth discussions of certain parts of electrodynamics are given
in Chapters 9 and 13.

8.1 Circuits 

A quick review of the quantities used in electric circuits is given in Table 8.1. Here
Q indicates charge, t time, U energy, and Φ magnetic flux. The two most commonly
measured quantities of any circuit are current and voltage. Current is the amount
of charge crossing a point in a circuit per unit time. For the purposes of this chapter
we will assume that current always remains constant, hence writing Q/t rather than
dQ/dt. Both inductors and capacitors have a time dependence, unless at equilibrium,
and subsequently will only be touched on briefly in this chapter.

Voltage is the electric potential energy per unit charge at a point in a circuit.
Essentially, current can be equated to the width of a river while voltage can be compared
to the velocity at which the river is flowing. This comparison does not quite match, but
sometimes can be useful for thinking intuitively about circuits, as electricity behaves
like water in many ways. The current of a circuit is measured with an ammeter 1 while
voltage is measured with a voltmeter. The symbol for both of these devices in an electric
circuit is given in Table 8.1.



1 A galvanometer is a specific type of ammeter, a significant step up from the original ammeter which
was the scientist shocking him or herself and trying to gauge the power of the shock by how much it
hurt.






8 Resistance 



Quantity      Definition   Symbol                  Units         Base Units

current       I = Q/t      circled A (ammeter)     amperes [A]   coulombs / seconds
voltage       V = U/Q      circled V (voltmeter)   volts [V]     joules / coulombs
resistance    R = V/I      zigzag line             ohms [Ω]      joules · seconds / coulombs²
capacitance   C = Q/V      two parallel plates     farads [F]    coulombs² / joules
inductance    L = Φ/I      coil                    henries [H]   webers · seconds / coulombs

Table 8.1: A review of basic quantities used to describe circuits. 



There are three more important basic properties of circuit components (ignoring transistors)
which are resistance, capacitance, and inductance. Resistance measures how strongly a
component of a circuit opposes the flow of current, and is given by the voltage across the
component divided by the current flowing through it. A resistor dissipates energy;
oftentimes resistors dissipate their energy through heat, but they can also emit light.
Capacitors and inductors serve the exact opposite purpose of resistors in circuits; rather
than dissipating energy, they store energy. Capacitors store energy by creating an electric
field and so capacitance is given by the charge stored on the capacitor divided by the
voltage gap across the capacitor.

Inductors store potential energy in the form of a magnetic field, oftentimes created
by electric current flowing. The inductance of an inductor is given by the magnetic
flux (the magnetic field passing through an area) divided by the current flowing around
the magnetic field. Inductors are usually small solenoids which consist of many turns of
wires wrapped around a cylindrical core. In a case like the solenoid, the current must be
divided by the number of times it circles the magnetic field and so the inductance for a
solenoid is usually given as NΦ/I where N is the number of turns within the solenoid. The
idea of inductance is explored in more detail in Chapter 9.

Current, voltage, and resistance are all connected through Ohm's law which states 
that voltage is just current times resistance. 



V = IR (8.1)



Additionally, the power dissipated by a resistor is equal to the square of the current 
running through the resistor times its resistance. 



P = I^2 R (8.2)



While Ohm's law seems extraordinarily simple, it is the basis for simple circuits, and 
the underlying theory is very involved on a microscopic level. 
Figure 8.1: Examples of a series and parallel circuit given in Figures 8.1(a) and 8.1(b)
respectively. The diagrams label a voltage source V_s and resistors R_1, R_2, and R_3.

Ohm's law in the form of Equation 8.1 is only useful for determining the current, 
voltage, or resistance of a simple circuit consisting of a power source and a resistor. 
However, more complex circuits can be broken into two categories, series circuits and 
parallel circuits. In a series circuit all the electrical components are placed one after 
another on a single electrical path as shown in Figure 8.1(a). For this type of circuit, 
resistances can be added together to find a total resistance, 

$$R_{\text{total}} = R_1 + R_2 + R_3 + \cdots \tag{8.3}$$

whereas the inverse of capacitances must be added. 

$$\frac{1}{C_{\text{total}}} = \frac{1}{C_1} + \frac{1}{C_2} + \frac{1}{C_3} + \cdots \tag{8.4}$$

Inductors in a series circuit are added just like resistors. 

In a parallel circuit a single electrical path breaks into multiple electrical paths, and 
then combines back into a single electrical path, like the diagram given in Figure 8.1(b). 
For parallel circuits the inverse of the resistances and inductances must be added to find 
the total resistance or inductance, 

$$\frac{1}{R_{\text{total}}} = \frac{1}{R_1} + \frac{1}{R_2} + \frac{1}{R_3} + \cdots \tag{8.5}$$

while the capacitances may just be added.

$$C_{\text{total}} = C_1 + C_2 + C_3 + \cdots \tag{8.6}$$
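
The four addition rules of Equations 8.3 through 8.6 reduce to two operations, adding
values directly or adding their inverses. A minimal Python sketch is given below; the
function names and the resistor values are hypothetical and chosen only for illustration.

```python
def add_directly(values):
    """Resistors or inductors in series, capacitors in parallel (Equations 8.3 and 8.6)."""
    return sum(values)


def add_inverses(values):
    """Capacitors in series, resistors or inductors in parallel (Equations 8.4 and 8.5)."""
    return 1.0 / sum(1.0 / value for value in values)


# Hypothetical resistances in ohms.
resistors = [100.0, 200.0, 300.0]
r_series = add_directly(resistors)    # larger than any single resistor
r_parallel = add_inverses(resistors)  # smaller than any single resistor
print(r_series, r_parallel)
```

The same two functions apply to capacitors with the roles swapped: `add_inverses` for a
series circuit and `add_directly` for a parallel circuit.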

8.2 Kirchhoff's Laws

Equations 8.3 through 8.6 are not fundamental laws themselves, but rather, can be
derived from what are known as Kirchhoff's laws which are given below.

1. Conservation of charge. All the current flowing into a junction must equal the 
current flowing out of the junction. 

2. Conservation of energy. The net voltage of any loop within a circuit must be zero. 

Both of these laws have an even more fundamental basis from Maxwell's equations, 
but for the purposes of this chapter, let us accept the two laws above without further 
derivation. In Chapter 13 Maxwell's equations are introduced. 

But how do we apply these laws to circuits? Let us first take the series circuit of 
Figure 8.1(a) as an example. The first step when using Kirchhoff's laws is to ensure that
all currents and resistances are labeled. The currents must all be labeled with arrows; 
the direction of the arrow does not matter, as long as it is maintained consistently 
throughout the application of the laws. In Figure 8.2(a) all the resistances and currents 
have been labeled and all currents have been given a direction. 

The next step is to apply Kirchhoff's first law to every junction in the circuit. The first
junction in this example occurs at the upper right hand corner of the circuit between
currents $I_1$ and $I_2$. The current $I_1$ is going into the junction (the arrow is pointing
to the junction) while $I_2$ is leaving the junction (the arrow is pointing away from the
junction). Using the first law we have $I_1 = I_2$. Similarly we can apply the same logic to
obtain $I_2 = I_3$ and consequently $I_1 = I_2 = I_3$.

Now Kirchhoff's second law can be applied to the circuit. To apply the second law, 
locate all the loops within the circuit and apply the law to each loop individually. In this 
example there is only one loop and so our job here is simplified. For each loop start at 
any point on the loop and trace around the loop. For each resistor crossed going in the 
direction of the current subtract the current times the resistance at that point. For every 
resistor crossed going in the opposite direction of the current, add on the current times 
resistance. For every power supply crossed going in the direction of the current, add on 
the power supply voltage. For every power supply crossed in the opposite direction of 
the current, subtract that voltage. 

After the loop is finished, equate all of these values with zero. For the example of
the series circuit we begin in the upper left hand corner and first cross $R_1$ with current
$I_1$ in the direction we are moving and so we must subtract $I_1R_1$. Next we cross $R_2$
with current $I_2$ so we must subtract $I_2R_2$. Crossing $R_3$ we must again subtract
$I_3R_3$ and finally we cross the power supply in the direction of the current so we add on
$V_s$. Putting this all together and equating to zero gives us the following,

$$V_s - I_1R_1 - I_2R_2 - I_3R_3 = 0 \quad\rightarrow\quad V_s = I_1(R_1 + R_2 + R_3) \tag{8.7}$$

where in the second step the relation between the currents obtained using Kirchhoff's 
first law was applied. From the second step we see that we have arrived at Equation 8.3! 
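
The bookkeeping of the second law for the series circuit can be sketched in a few lines of
Python. The function names and the values (a 12 V supply with resistors of 1, 2, and 3
ohms) are assumptions made for this illustration only.

```python
def series_current(v_s, resistances):
    """Solve V_s - I*R_1 - I*R_2 - ... = 0 for the single loop current I."""
    return v_s / sum(resistances)


def loop_sum(v_s, current, resistances):
    """Trace the loop along the current: add the supply, subtract I*R at each resistor."""
    total = v_s
    for resistance in resistances:
        total -= current * resistance
    return total


# Hypothetical values: a 12 V supply and resistors of 1, 2, and 3 ohms.
current = series_current(12.0, [1.0, 2.0, 3.0])
residual = loop_sum(12.0, current, [1.0, 2.0, 3.0])
print(current, residual)
```

The residual returned by `loop_sum` is the net voltage around the loop, which the second
law requires to be zero once the correct current is used.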

The above example is somewhat trivial, but it is important to be careful about skipping 
steps when using Kirchhoff's laws. Oftentimes shortcuts can seriously reduce the time 
needed to complete a problem, but mistakes can easily creep in without notice. By 
following the laws down to even the trivial steps, these mistakes can oftentimes be 
avoided. 

Now let us consider the slightly more difficult case of the parallel circuit in Figure 
8.1(b). Again we begin by drawing current labels with associated directions as is done 
in Figure 8.2(b). Applying Kirchhoff's first law we obtain, 



[Equation 8.8: one junction relation for each of the seven junctions in Figure 8.2(b); the
current subscripts are not legible.]







where the currents flowing in and out of each junction have been equated, beginning in 
the upper left hand corner and proceeding clockwise. 
After a little manipulation we see that

$$I_4 = I_1 + I_2 + I_3 \tag{8.9}$$

and that $I_5 = I_8$, etc. We could have come to this conclusion much more quickly by
simplifying the diagram so that $I_1$, $I_2$, and $I_3$ are all joined at the same points.
However, the law was applied in full to demonstrate the method.

Figure 8.2: Figures 8.1(a) and 8.1(b) with current labels and arrows added.

Next we apply the second law. Starting in the upper left hand corner of the diagram
and proceeding clockwise there are three possible loops, with each loop crossing $V_s$
and $R_1$, $R_2$, or $R_3$, all in the direction of the current. This yields the following
three equations.

$$V_s = I_1R_1, \qquad V_s = I_2R_2, \qquad V_s = I_3R_3 \tag{8.10}$$

Using the last relation of Equation 8.10 and substituting in $I_3$ from Equation 8.9 we
obtain,

$$V_s = (I_4 - I_2 - I_1)R_3 \tag{8.11}$$

which can be simplified further by using the first two relations of Equation 8.10 to replace
$I_1$ and $I_2$.

$$\frac{V_s}{R_1} + \frac{V_s}{R_2} + \frac{V_s}{R_3} = I_4 \quad\rightarrow\quad V_s = \frac{I_4}{\frac{1}{R_1} + \frac{1}{R_2} + \frac{1}{R_3}} \tag{8.12}$$

Comparing this with $V_s = I_4R_{\text{total}}$, we have derived Equation 8.5. The methods
for finding how to add capacitances and inductances together are similar, and can be
explored further by the reader.
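
The parallel result can be checked numerically by following the same two steps: the second
law gives each branch current, and the first law adds them into the supply current. The
Python sketch below uses hypothetical values (a 6 V supply with branches of 1, 2, and 3
ohms), and the variable names are assumptions for this illustration.

```python
v_s = 6.0                      # supply voltage in volts
resistances = [1.0, 2.0, 3.0]  # R_1, R_2, R_3 in ohms

# Second law, Equation 8.10: each loop gives V_s = I_n * R_n.
branch_currents = [v_s / r for r in resistances]

# First law, Equation 8.9: the supply current is the sum of the branch currents.
total_current = sum(branch_currents)

# Effective resistance seen by the supply, compared with Equation 8.5.
r_from_kirchhoff = v_s / total_current
r_from_formula = 1.0 / sum(1.0 / r for r in resistances)
print(r_from_kirchhoff, r_from_formula)
```

Both routes give the same total resistance, and it is smaller than the smallest branch
resistance because every added branch provides another path for the current.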

8.3 Experiment 

In the two examples given above, the resistances have been assumed to be known, while
the currents, and consequently voltages, were unknown. This is because in general, the
resistances of components used within a circuit are known to very high precision, and at
standard temperature and pressure remain very stable. Current and voltage on the other
hand can change greatly, depending upon the power supply. Power supplies typically are
constant voltage; they always provide the same voltage, and change the supplied
current as necessary. Of course, most power supplies can only supply up to a maximum
current, after which the voltage is compromised. As an example, batteries such as the
AA battery supply a nearly constant voltage of between 1.2 and 1.5 V and whatever current
is required. These batteries can only supply a very limited amount of current and over
time the voltage degrades quickly.

Figure 8.3 shows the discharge characteristics of a rechargeable AA battery. When a
comparatively low current of 1,250 mA is required the voltage remains relatively constant
over a period of about 2 hours. When a very large current such as 5,000 mA is required
the battery lasts only about half an hour and the voltage does not plateau. Oftentimes
electrical devices such as cell phones provide an estimate of the battery life on screen
by digitally measuring the voltage of the battery. However, as can be seen from Figure
8.3, the voltage does not provide a very good indicator of remaining battery life, and so
usually the battery estimators on cell phones are not very accurate.

Figure 8.3: Voltage versus time (in hours) for a typical AA battery at constant currents of
1250 mA, 2500 mA, and 5000 mA. The data from this plot was taken from the product data
sheet for the Energizer NH15-2500 rechargeable AA battery.

The point of the discussion above is that usually the current and voltage of a power 
supply are not known to a very high precision, so how then are we able to measure 
resistance to such high precision? One possibility is given in the circuit diagram of
Figure 8.4(a). Here a power supply is connected to an ammeter and resistor in series, with
a voltmeter in parallel, measuring the voltage drop. The first problem with this setup 
is that two readings are being taken, current and voltage. The second problem is that 
the leads to and from the voltmeter, along with the leads from the ammeter provide 
resistance that is not taken into consideration. 

To counter these effects, a device known as the Wheatstone bridge was developed,
which utilizes what is known as a difference measurement. The circuit for a typical
Wheatstone bridge is given in Figure 8.4(b). In this diagram it is assumed that the values
for $R_1$ and $R_3$ are known to high precision. The resistor $R_2$ is a variable resistor
and $R_x$ is the unknown resistor being measured. But how does this setup help us determine
the value for $R_x$? Let us apply Kirchhoff's laws to find out.

First, we label all the currents in Figure 8.4(b) and apply Kirchhoff's first law. For
resistors $R_1$ through $R_x$ consider currents $I_1$ through $I_x$ all flowing downwards.
Again, we could have chosen to have the currents flowing upwards, or counterclockwise, or
with whatever configuration we would like, but this configuration is the most intuitive
physically. Now let us define $I_a$ as the current flowing across the ammeter in the center
from right to left. We actually don't know which direction this current is flowing as this
is dependent upon the values of the resistors, but we can arbitrarily decide it is flowing
from right to left. Finally, we define $I_s$ to be the current flowing from the power supply
to the top of the bridge, and the current flowing from the bottom of the bridge to the
power supply.

From Kirchhoff's first law we have,

$$\begin{aligned} I_s &= I_1 + I_3 \\ I_3 &= I_a + I_x \\ I_s &= I_2 + I_x \\ I_2 &= I_1 + I_a \end{aligned} \tag{8.13}$$

from starting at the top junction of the bridge and moving clockwise about the diagram.
Now we can apply Kirchhoff's second law to the diagram. There are two loops of interest.
The first loop is the top triangle of the bridge and starts at the top of the bridge and
moves across $R_3$, then the ammeter, and back to the top across $R_1$. The second loop is
the bottom triangle of the bridge and starts at the far right point and crosses $R_x$, then
$R_2$, and returns across the ammeter to the far right point. Applying Kirchhoff's second
law to these loops we arrive at,

$$\begin{aligned} 0 &= -I_3R_3 - I_aR_a + I_1R_1 \\ 0 &= -I_xR_x + I_2R_2 + I_aR_a \end{aligned} \tag{8.14}$$

where $R_a$ is the resistance of the ammeter.

After moving the negative terms in Equation 8.14 to the left side of the relations we
have the following.

$$\begin{aligned} I_3R_3 + I_aR_a &= I_1R_1 \\ I_xR_x &= I_2R_2 + I_aR_a \end{aligned} \tag{8.15}$$

Dividing the bottom relation by the top relation of Equation 8.15,

$$\frac{I_xR_x}{I_3R_3 + I_aR_a} = \frac{I_2R_2 + I_aR_a}{I_1R_1} \tag{8.16}$$

gives us a ratio between the resistances and currents.

Taking the second and last relation of Equation 8.13 we can now substitute out $I_x$
and $I_1$ from Equation 8.16.

$$\frac{(I_3 - I_a)R_x}{I_3R_3 + I_aR_a} = \frac{I_2R_2 + I_aR_a}{(I_2 - I_a)R_1} \tag{8.17}$$

From this we now see something interesting, although it may not be obvious at first
glance. If the current flowing across the ammeter is zero, Equation 8.17 becomes,

$$\frac{I_3R_x}{I_3R_3} = \frac{I_2R_2}{I_2R_1} \quad\rightarrow\quad \frac{R_x}{R_3} = \frac{R_2}{R_1} \quad\rightarrow\quad R_x = \frac{R_3R_2}{R_1} \tag{8.18}$$

and we can find the unknown resistance $R_x$ if we know all the other resistances! This
is how the Wheatstone bridge works. The variable resistor is changed until the current
flowing across the center of the bridge becomes zero. Once this occurs, the unknown
resistance can be determined using the other three known resistances.
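
The balancing procedure can be imitated numerically. The Python sketch below is only an
illustration under stated assumptions: it takes $R_1$ and $R_2$ as the top and bottom of the
left arm, $R_3$ and $R_x$ as the top and bottom of the right arm, and the ammeter joining the
two midpoints. It applies the first law at the two midpoints, solves for their voltages,
and returns the ammeter current. The function names and component values are hypothetical.

```python
def unknown_resistance(r1, r2, r3):
    """Equation 8.18: R_x once the bridge is balanced."""
    return r3 * r2 / r1


def ammeter_current(v_s, r1, r2, r3, rx, ra):
    """Current through the ammeter, counted positive from right to left.

    The first law at the left and right midpoints gives two linear equations
    for the midpoint voltages, which are solved here with Cramer's rule.
    """
    a11 = 1.0 / r1 + 1.0 / r2 + 1.0 / ra
    a22 = 1.0 / r3 + 1.0 / rx + 1.0 / ra
    a12 = -1.0 / ra
    b1, b2 = v_s / r1, v_s / r3
    det = a11 * a22 - a12 * a12
    v_left = (b1 * a22 - a12 * b2) / det
    v_right = (a11 * b2 - a12 * b1) / det
    return (v_right - v_left) / ra


# Hypothetical bridge: R_1 = 100, R_3 = 200, R_x = 300 ohms, 1 ohm ammeter, 10 V supply.
for r2 in (100.0, 150.0, 200.0):
    print(r2, ammeter_current(10.0, 100.0, r2, 200.0, 300.0, 1.0))

print(unknown_resistance(100.0, 150.0, 200.0))
```

The ammeter current changes sign as the variable resistor passes through the balance
point, which is what makes the zero reading easy to locate in practice.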

In the experiment associated with this chapter the resistance for a piece of wire is
determined along with the resistivity constant, $\rho$, for the metal the wire is made from.
For a conductor through which direct current is running, the resistance is,

$$R = \rho\frac{\ell}{A} \tag{8.19}$$

where $\rho$ is the resistivity constant of the resistive material, $\ell$ is the length of
the material, and $A$ is the cross-sectional area of the material.[2] Subsequently, if we
measure the total resistance for a length of wire, along with the length of the wire, and
the width of the wire, we can determine $\rho$.
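
Solving Equation 8.19 for the resistivity is a one-line calculation. The Python sketch
below assumes a wire of circular cross-section, so that the area is $\pi r^2$; the function
name and the measurement values are hypothetical and serve only as an illustration.

```python
import math


def resistivity(resistance, length, radius):
    """Solve Equation 8.19 for rho, assuming a circular cross-section A = pi * r^2."""
    area = math.pi * radius ** 2
    return resistance * area / length


# Hypothetical measurement: 0.5 ohm across 1.0 m of wire with a 0.1 mm radius.
rho = resistivity(0.5, 1.0, 1.0e-4)
print(rho)  # in ohm metres
```

Because the radius enters squared, an error in the measured width of the wire affects the
result twice as strongly as the same relative error in the length or the resistance.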

[2] For wires we assume they are circular and so $A$ becomes $\pi r^2$ where $r$ is the
radius of the wire.

Figure 8.4: Two methods for measuring resistance. In Figure 8.4(a) the current and
voltage are measured and the resistance is determined using Ohm's law. In
Figure 8.4(b) a Wheatstone bridge is used.

The apparatus used to determine the resistance of the length of wire is the same as
that of Figure 8.4(b) but instead of using a variable resistor for $R_2$ and a known resistor
for $R_1$ we take advantage of Equation 8.19 and replace both with a single length of wire.
Consequently, the resistances of $R_1$ and $R_2$ become,

$$R_1 = \rho\frac{\ell_1}{A}, \qquad R_2 = \rho\frac{\ell_2}{A} \tag{8.20}$$

which can be substituted back into Equation 8.18.

$$R_x = R_3\frac{\ell_2}{\ell_1} \tag{8.21}$$

The left side of the bridge becomes a wire with a fixed length, and the ammeter is
connected to it with a moveable connection. The connection is moved up and down the
wire until the ammeter indicates zero current flow, and the lengths $\ell_1$ and $\ell_2$ are
then measured. From these lengths $R_x$ is determined using Equation 8.21!
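
The reason only the two lengths are needed is that the resistivity and the cross-sectional
area of the wire cancel in the ratio. The Python sketch below checks this with made-up
numbers; the function names and all values are assumptions for the illustration.

```python
import math


def wire_resistance(rho, length, radius):
    """Equation 8.19 (and 8.20) for a wire of circular cross-section."""
    return rho * length / (math.pi * radius ** 2)


def unknown_from_lengths(r3, length_1, length_2):
    """Equation 8.21: R_x from the known resistor and the two wire lengths."""
    return r3 * length_2 / length_1


# Hypothetical values: rho in ohm metres, lengths and radius in metres, R_3 in ohms.
rho, radius, r3 = 1.0e-6, 2.0e-4, 100.0
length_1, length_2 = 0.4, 0.6

r1 = wire_resistance(rho, length_1, radius)
r2 = wire_resistance(rho, length_2, radius)
rx_from_bridge = r3 * r2 / r1                                   # Equation 8.18
rx_from_lengths = unknown_from_lengths(r3, length_1, length_2)  # Equation 8.21
print(rx_from_bridge, rx_from_lengths)
```

Changing `rho` or `radius` in the sketch leaves both results unchanged, so the properties
of the slide wire itself never need to be known.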
Whenever electricity is discussed, such as in Chapter 8, the basic components of an
electric circuit are usually discussed: resistors, capacitors, measuring tools, and power 
supplies. But oftentimes another basic electrical component, the inductor, is ignored, or 
at least given only a brief explanation. This is not because inductance is unimportant, 
but because the theory behind inductance can be more complicated than resistance or 
capacitance. 

Explaining inductance is not simple, so let us begin with the explanation of inductors, 
given in Chapter 8. A resistor is a component in a circuit that dissipates energy, whether 
through heat, light, or some other energy transfer mechanism. Capacitors and inductors 
are the opposites of resistors, and store energy, rather than dissipate it. In the case of a 
capacitor, the energy is stored in an electrical field, while in the case of the inductor, the 
energy is stored in a magnetic field. This is the core idea behind inductance: it connects
electricity with magnetism into electromagnetism.

Inductance plays an important role in everyday life. Without inductance, radios and 
televisions, AC power transformers, electrical motors, all would not work. Perhaps most 
importantly, electricity would no longer be available, as the concept by which all electrical 
generators operate is inductance. To begin understanding the theory behind inductance
we first need to understand the fundamental interaction between charged particles and
electromagnetic fields.

9.1 Lorentz Force 

The Lorentz force describes the force felt on a charged particle, such as an electron, 
as it passes through electric and magnetic fields. From intuition, we know that placing 
two objects with the same charge next to each other causes the objects to be repelled. 
This force due to the electric fields of the two objects is called the Coulomb force. But 
what happens if a charged object is placed into a magnetic field? Does the object feel a 
force? The answer is, it depends. 

If the object is not moving, it does not feel a force. However, if the object is moving, 
it feels a force proportional to the charge of the object, the strength of the magnetic 
field, and the velocity at which the object is moving in the magnetic field. At this point, 
perhaps the obvious question to ask is, why does the force on the object depend upon 
its velocity? The answer can be very complicated but the following explanation, while 
greatly simplified, will hopefully shed some light. 

The idea in electromagnetism is that electric fields and magnetic fields are actually the
same thing; it just depends upon which reference frame the electric or magnetic field
is observed in. This is a consequence of relativity, which while an incredibly beautiful
theory, will not be discussed here![1] For example, take a common bar magnet. If we
set it on a table, we will just observe a pure magnetic field. However, if we run by the
table, we will begin to observe an electric field, and less of a magnetic field. This
is why the force on a charged particle from a magnetic field is due to the velocity of the
particle. As soon as the particle begins to move, it starts to see an electric field and
begins to experience Coulomb's force.

So enough qualitative discussion and time for an equation. The Lorentz force is
expressed mathematically as,

$$\vec{F} = q\vec{E} + q\vec{v}\times\vec{B} \tag{9.1}$$

where $q$ is the charge of the object, $\vec{v}$ is the velocity of the object, $\vec{E}$ is
the electric field, and $\vec{B}$ is the magnetic field. Notice that everything here is a
vector (except for charge)! That is because all of the quantities above, force, velocity,
electric field, and magnetic field have direction.[2]

It is important to realize that the second term in Equation 9.1 is the cross product
of the velocity of the charged object with the magnetic field, which has a magnitude of
$vB\sin\theta$ where $\theta$ is the angle between the velocity vector and the magnetic field
vector. This means that if the charged particle is moving in the same direction that the
magnetic field is pointing, it experiences no force, whereas if the charged particle is
moving perpendicular to the magnetic field, it experiences a force of $qvB$ where $v$ and
$B$ are the magnitudes of $\vec{v}$ and $\vec{B}$.

Let us consider a simple example that requires the application of Equation 9.1 and 
the Lorentz force. Consider a particle with positive charge q moving through both a 
constant electric field and magnetic field, as depicted in Figure 9.1. The directions of 
the fields are important, and in this diagram the electric field is pointing down the page, 
while the magnetic field is pointing into the page. The particle is traveling from left to 
right across the page with velocity v. If the electric field has a magnitude of E, what is 
the magnitude of the magnetic field, B, required so that the particle moves in a straight 
line? 

If the particle moves in a straight line, we know from Chapter 3 and Newton's first
law that the external force acting on the particle must be zero.

$$\vec{F} = 0 = q\vec{E} + q\vec{v}\times\vec{B} \tag{9.2}$$

Next, because the particle has a positive charge of $+q$ we know that the electric field is
exerting a force downward of $qE$, where $E$ is the magnitude of the electric field. The
magnetic field is perpendicular to the velocity of the particle, and so we know that the
force being exerted on the particle from the magnetic field is $qvB$.



[1] For intrepid readers who would like to read more, I would suggest A. P. French's book
Special Relativity.

[2] Perhaps at this point the astute reader will then ask, why isn't the first term also
dependent upon velocity? If magnetic fields transform into electric fields, shouldn't
electric fields transform into magnetic fields? The answer is yes, and the explanation
given above really is not correct, but gives the general idea of what is happening without
spending an entire book on it.

Figure 9.1: An example of the Lorentz force where a charged particle is traveling through
both an electric and magnetic field. If $B = E/v$ then the particle travels in
a straight line. If $B > E/v$ the particle curves upwards and if $B < E/v$ the
particle curves downwards.



But what is the direction of the force from the magnetic field? From the right hand
rule, we know that the direction of the force from the magnetic field must be upward.[3]
This means that the force from the electric field and the magnetic field are in opposite
directions and so,

$$qE = qvB \tag{9.3}$$

which gives a value of $B = E/v$ for the magnetic field.

What happens if $B \neq E/v$? Now the particle does experience a net force, and so it will
no longer travel in a straight line. The upper dotted path in Figure 9.1 illustrates the 
trajectory of the particle if B > E/v and the lower dotted path illustrates the trajectory 
of the particle if B < E/v. This technique of controlling the trajectory of a charged 
particle with a certain velocity using electric and magnetic fields is common in particle 
physics (and even devices such as a mass spectrometer) and is called a velocity selector 
as only a particle with the proper velocity can pass through without being deflected. 
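
The velocity selector can be checked by evaluating Equation 9.1 component by component.
The Python sketch below is an illustration under assumed axes (x to the right, y up the
page, z out of the page); the function names and the field strengths are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, e_field, velocity, b_field):
    """Equation 9.1: F = qE + q v x B, component by component."""
    v_cross_b = cross(velocity, b_field)
    return tuple(q * (e + vxb) for e, vxb in zip(e_field, v_cross_b))


# Assumed axes: x to the right, y up the page, z out of the page.
q = 1.602e-19               # charge in coulombs (roughly one elementary charge)
e_mag = 1000.0              # electric field magnitude in V/m, pointing down the page
speed = 2.0e5               # particle speed in m/s, moving to the right
b_balanced = e_mag / speed  # Equation 9.3 solved for B, in tesla


def net_force(b_mag):
    """Net force for a magnetic field of magnitude b_mag pointing into the page."""
    return lorentz_force(q, (0.0, -e_mag, 0.0), (speed, 0.0, 0.0), (0.0, 0.0, -b_mag))


print(net_force(b_balanced))        # y component vanishes up to rounding
print(net_force(2.0 * b_balanced))  # positive y component: the particle curves upwards
print(net_force(0.5 * b_balanced))  # negative y component: the particle curves downwards
```

The sign of the y component reproduces the two dotted paths described for Figure 9.1.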

9.2 Biot-Savart Law 

We now know how a magnetic field can affect an electrically charged object through the
Lorentz force, but can the opposite happen? Can an electrically charged object affect a
magnetic field? The answer is yes, and the phenomenon is described by the Biot-Savart
law. The Biot-Savart law (conveniently abbreviated the B.S. law) states that any charge
flowing in a loop will produce a magnetic field. While the Biot-Savart law might sound
new, it is a physics phenomenon that is encountered daily. A prime example of the law in
action is in the operation of electromagnets which are used in everything from electric
motors to audio speakers.

[3] The right hand rule is a useful tool for determining the direction of cross products
like in Equation 9.1. Using your right hand, point your fingers along the direction of the
first vector (in this case $\vec{v}$). Next bend your fingers at the knuckles so that they
point in the direction of the second vector, or $\vec{B}$. Now look at the direction that
your thumb is pointing. This is the direction of the cross product of the two vectors. The
right hand rule is easier to see in action than to have described on paper.

An electromagnet usually consists of a ferromagnetic material, wrapped in coils of 
wire. A ferromagnetic material is just a material that can be magnetized by an external 
magnetic field, but after the magnetic field is removed, it slowly loses its magnetization. 
For example, a needle can be magnetized using a bar magnet, but slowly over time the 
needle demagnetizes. This magnetization of the needle occurs because the magnetic 
dipoles of the molecules line up in the magnetic field, causing a net magnetic field, but 
slowly, over time, come out of alignment due to random movements of the molecules in 
the needle. 

Back to the principle driving an electromagnet. Charge runs through the coils of wire 
surrounding the ferromagnet, and because the charge is flowing in a loop, a magnetic 
field is created. This magnetic field aligns the magnetic moments of the molecules in the 
ferromagnet, and the ferromagnet produces a net magnetic field. As soon as charge stops 
flowing through the coils of the electromagnet, the ferromagnet is no longer subjected 
to an external magnetic field, and so it loses its magnetization. 

But what is the magnetic field given by a single loop of an electromagnet? The
Biot-Savart law is given by,

$$\vec{B} = \oint \frac{\mu_0}{4\pi}\frac{I\,d\vec{l}\times\vec{r}}{|\vec{r}|^3} \tag{9.4}$$

where $\vec{B}$ is the magnetic field created by the current loop, $\mu_0$ is the magnetic
constant, $I$ is the current flowing in the loop, $\vec{r}$ is the vector from the current
loop to the point where the magnetic field is being calculated, and $d\vec{l}$ is an
infinitesimal length along the current loop.[4] Describing Equation 9.4 with just words is
not that useful, so we turn to the diagram of Figure 9.2 to hopefully make things a little
clearer.

The current, $I$, is flowing around the circular loop in a clockwise direction from above.
We wish to calculate the magnetic field at the center of the loop $(x, y) = (0, 0)$.[5] We
split the loop into infinitesimal pieces of size $dl$, with one of these pieces indicated in
bold on the diagram. Each of these pieces has a direction which points along the direction
of the current. In the example here, the direction of $d\vec{l}$ is straight upwards in the
$y$ direction (tangential to the circular loop). We need to take the cross product of
$d\vec{l}$ with the vector $\vec{r}$, which points from $d\vec{l}$ to the center of the circle.

[4] The $\oint$ symbol is the mathematical notation for a path integral. For example, if we
are finding the magnetic field from a square with current running through it, the integral
of Equation 9.4 is performed over the path traced out by current passing through the square.

[5] Of course this problem could be done in three dimensions, but this makes the example
slightly easier to follow.

Figure 9.2: An example of the Biot-Savart law. Current is flowing clockwise through the
loop, and consequently creating a magnetic field.

Because $d\vec{l}$ is perpendicular to $\vec{r}$ (a nice property of circles), the cross
product of the two vectors is just $r\,dl$ and points into the page by the right hand rule.
From the infinitesimal portion of the current loop, a magnetic field of strength,

$$d\vec{B} = \frac{\mu_0 I\,dl}{4\pi r^2}(-\hat{k}) \tag{9.5}$$

is created, where $-\hat{k}$ indicates the vector is pointing along the negative $z$
direction, or in this case, into the page.[6] To find the total magnetic field at the center
of the loop we now need to add up all the magnetic fields contributed from the infinitesimal
pieces, $dl$.

Luckily, it turns out that wherever $d\vec{l}$ is located on the circle, it always provides
the same contribution to the magnetic field, Equation 9.5. This means that we can just
multiply Equation 9.5 by $2\pi r$ (the circumference of the circle) to perform the integration
of Equation 9.4. This gives us a total magnetic field of,

$$\vec{B} = \frac{\mu_0 I}{2r}(-\hat{k}) \tag{9.6}$$

again, pointing into the page.

Actually, it turns out that it was not luck that all the contributions from dl were the 
same. This example was given because the answer is relatively simple to calculate. If 
we had tried to find the magnetic field at the point (x, y) = (0, r/2) the problem would 
have become much more complicated. Now the magnetic field contributions from dl 
are all different, and even more importantly, they don't all point in the same direction. 
Calculating out the magnetic field this way is possible, but very tedious. 
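
When the integral cannot be done by hand, the sum over the pieces $d\vec{l}$ can be carried
out numerically. The Python sketch below is an illustration only: it approximates the loop
by many short straight segments, assumes the field point lies in the plane of the loop, and
uses a hypothetical current of 2 A and radius of 5 cm. The function name `loop_field_z` and
the number of segments are assumptions made for this example.

```python
import math

MU_0 = 4.0e-7 * math.pi  # magnetic constant in T m / A (to good approximation)


def loop_field_z(current, radius, x, y, segments=10000):
    """z component of B at the in-plane point (x, y) for a clockwise loop at the origin."""
    total = 0.0
    d_phi = -2.0 * math.pi / segments  # negative step: clockwise with z out of the page
    for n in range(segments):
        phi = (n + 0.5) * d_phi
        # Position of this piece of the loop and its direction dl along the current.
        px, py = radius * math.cos(phi), radius * math.sin(phi)
        dlx, dly = -radius * math.sin(phi) * d_phi, radius * math.cos(phi) * d_phi
        # Vector r from the piece of the loop to the field point.
        rx, ry = x - px, y - py
        r_cubed = math.hypot(rx, ry) ** 3
        # z component of dl x r, divided by |r|^3, as in Equation 9.4.
        total += (dlx * ry - dly * rx) / r_cubed
    return MU_0 * current / (4.0 * math.pi) * total


current, radius = 2.0, 0.05
b_center = loop_field_z(current, radius, 0.0, 0.0)
b_analytic = -MU_0 * current / (2.0 * radius)  # Equation 9.6, negative z is into the page
b_off_center = loop_field_z(current, radius, 0.0, radius / 2.0)
print(b_center, b_analytic, b_off_center)
```

At the center the sum agrees with Equation 9.6, while at $(0, r/2)$ the same code gives a
stronger field pointing into the page without any extra effort.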

9.3 Lenz's and Faraday's Laws 

Up to this point we have looked at constant electric and magnetic fields. But what
happens if we look at a changing magnetic field? This requires the introduction of both
Lenz's law and Faraday's law. Lenz's law states that if a magnetic field is passing
through a loop made of some conductor, and the magnetic field changes, a current will
be produced in the conductor that creates a magnetic field (through the Biot-Savart law)
which tries to keep the magnetic field the same. If this concept sounds somewhat familiar,
it should. Looking all the way back to Chapter 3, Newton's first law essentially states
that objects like to stay in the state that they currently are in. A block at rest does not
want to move, and a block that is moving does not want to stop moving.

[6] For those unfamiliar with hat notation, the unit vectors $\hat{i}$, $\hat{j}$, and
$\hat{k}$ point in the $x$, $y$, and $z$ directions and have a length of 1.

Lenz's law is exactly the same idea, but now dealing with magnetic fields. A conductor 
loop in a magnetic field wants the magnetic field to stay the same, and if the field changes 
the conductor tries to compensate by creating its own magnetic field. The idea is very 
much like inertia. Just as it is difficult to bring a block to a stop, it is difficult to change 
a magnetic field surrounded by a conducting loop. The process by which a current is 
created within the conductor is called induction. 

Faraday's law takes Lenz's law and adds some math behind the concept by stating
that the induced voltage in the conducting loop is equal to minus the derivative of the
magnetic field passing through the loop with respect to time, or,

$$V = -\frac{d(\vec{B}\cdot\vec{A})}{dt} \tag{9.7}$$



where V is the voltage induced in the conducting loop, B is the magnetic field passing 
through the loop, and A is the area surrounded by the loop through which the magnetic 
field passes. It is important to notice that the area of the loop is a vector, not just a 
number. The magnitude of A is A, but the direction is important as well, because the dot 
product between B and A needs to be taken. The direction of A is the direction of the 
normal to the surface A. What this means is that A points in a direction perpendicular 
to the surface which it represents. 

Again, it may be simpler to explain Equation 9.7 with an example. Consider a circular 
loop of some conductor with a setup similar to Figure 9.2, but now a magnetic field is 
passing through the loop. First, what happens if we consider a magnetic field that does 
not change over time? What is the current induced in the conducting loop? The answer 
is zero, because the derivative of a constant is zero. The magnetic field is not changing, 
and so the conducting loop does not need to compensate to keep the magnetic field 
constant. Looking at a slightly more complicated example, let us consider a magnetic 
field given by, 

$$\vec{B}(t) = \frac{B_0}{t^2}\hat{k} \tag{9.8}$$

where $B_0$ is some constant, and $\vec{B}$ is pointing out of the page, while decreasing
over time.

To find out what the voltage induced in the circuit is, we must take the dot product of
$\vec{B}$ with $\vec{A}$. The magnetic field is pointing out of the page, as is $\vec{A}$, and
so the dot product of the two is $B_0A/t^2$. Taking the time derivative of this as in
Equation 9.7, with the loop area $A = \pi r^2$, yields the induced voltage,

$$V = \frac{2\pi r^2 B_0}{t^3} \tag{9.9}$$

which will create a magnetic field through the Biot-Savart law. By Lenz's law we know
that the magnetic field will compensate for the loss of $\vec{B}(t)$ and so we know that the
current in the conducting loop must be flowing counterclockwise. If the resistance $R$
of the conductor were known, we could calculate the self-induced magnetic field using
Ohm's law (Equation 8.1) to determine $I$ from Equation 9.9, and plug this into Equation
9.4.
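
The derivative leading to Equation 9.9 can be checked numerically. The Python sketch below
assumes Faraday's law in the form of minus the time derivative of $\vec{B}\cdot\vec{A}$, and
compares a finite-difference estimate with the closed form. The function names, the step
size, and the values of $B_0$, $r$, and $t$ are hypothetical.

```python
import math


def flux(t, b0, radius):
    """B . A for the field B_0 / t^2 passing straight through a loop of the given radius."""
    return b0 / t ** 2 * math.pi * radius ** 2


def induced_voltage_numeric(t, b0, radius, dt=1.0e-6):
    """Minus the central-difference estimate of the time derivative of the flux."""
    return -(flux(t + dt, b0, radius) - flux(t - dt, b0, radius)) / (2.0 * dt)


def induced_voltage_formula(t, b0, radius):
    """Closed form of Equation 9.9."""
    return 2.0 * math.pi * radius ** 2 * b0 / t ** 3


# Hypothetical values: B_0 = 1 (in units of T s^2), r = 0.1 m, evaluated at t = 2 s.
v_numeric = induced_voltage_numeric(2.0, 1.0, 0.1)
v_formula = induced_voltage_formula(2.0, 1.0, 0.1)
print(v_numeric, v_formula)
```

Both values are positive, reflecting that the induced voltage acts to replace the
decreasing field, and the voltage itself dies away as $1/t^3$.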



9.4 Dipoles 

Equation 9.1 allows us to calculate the force felt on a charged object, but requires 
knowledge of the electric field, E, the magnetic field, B, the charge of the object, and 
the velocity of the object. These last two quantities, charge and velocity should already 
be familiar, but how do we determine the electric and magnetic fields? The electric field 
from a single charge, such as an electron is given by, 

$$\vec{E} = \frac{q}{4\pi\epsilon_0}\frac{\hat{r}}{r^2} \tag{9.10}$$

where $q$ is charge, $\epsilon_0$ is the electric constant, and $r$, just as before, is the
distance from the charge. Notice that the electric field, $\vec{E}$, is a vector quantity,
and has a direction. The direction of $\vec{E}$ is given by the only vector quantity on the
right hand side of the equation, $\hat{r}$, which has a length of one, and points from the
electric charge to the point where $\vec{E}$ is being measured. This type of vector which
has a length of one is called a unit vector.

There are two important points to notice about Equation 9.10. The first is that if an 
observer is measuring the electric field, the direction of the field will always be pointing 
directly towards or away (depending upon the electric charge) from the charged object. 
The second point is that the electric field falls off quadratically as the distance r between 
the observer and the object increases. For example, if an observer measures the electric 
field from an object at one meter, and then at two meters, the electric field will have 
been reduced by a factor of four. In the remainder of this section, we explore slightly 
more complicated electric fields from multiple charged objects, and then extend this to 
magnetic fields. The math can be very involved, so even if the following part of this 
section seems incomprehensible, remember the two points above. 
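The second point can be checked with a few lines of Python. The sketch below evaluates the magnitude of Equation 9.10 at one meter and at two meters; the function name `point_charge_field` and the choice of an electron-sized charge are our own, made only for illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12         # electric constant in F/m
ELEMENTARY_CHARGE = 1.602176634e-19  # magnitude of the electron charge in C


def point_charge_field(q, r):
    """Magnitude of the electric field (V/m) a distance r (m) from a charge q (C)."""
    return abs(q) / (4 * math.pi * EPSILON_0 * r**2)


field_at_1m = point_charge_field(ELEMENTARY_CHARGE, 1.0)
field_at_2m = point_charge_field(ELEMENTARY_CHARGE, 2.0)
ratio = field_at_1m / field_at_2m
print(ratio)  # doubling the distance should reduce the field by a factor of four
```

The ratio does not depend on the charge chosen, only on the two distances.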

Equation 9.10 gives the electric field in polar coordinates, but what if we would like
the electric field in Cartesian coordinates?⁷ We can simply write $r$ and $\hat{r}$ in Cartesian
coordinates. The distance $r$ is given by the Pythagorean theorem, $r^2 = x^2 + y^2$. The unit
vector $\hat{r}$, however, is a bit trickier. We must split $\hat{r}$ into the unit vector in the $x$ direction,
$\hat{i}$, and the unit vector in the $y$ direction, $\hat{j}$. This gives us the vector $x\hat{i} + y\hat{j}$, which is in
the same direction as $\hat{r}$, but this is not a unit vector, because it has length $r$. If we divide
by $r$, then we obtain a vector with length one, which gives us $\hat{r} = (x\hat{i} + y\hat{j})/\sqrt{x^2 + y^2}$.
Plugging in the values we have found for $r$ and $\hat{r}$ yields Equation 9.11, which is the



7 For those readers unfamiliar with polar and Cartesian coordinates, briefly read over Chapter 5. 
Figure 9.3: A diagram of a dipole (either electric or magnetic), where the dipole is aligned
along the y-axis, the dipole charges are separated by a distance of $\ell$, and the
field is being measured at point $P = (x, y)$.



electric field from a single charge in Cartesian coordinates. 



$$\vec{E} = \frac{q}{4\pi\epsilon_0}\,\frac{x\hat{i} + y\hat{j}}{\left(x^2 + y^2\right)^{3/2}} \tag{9.11}$$



This is convenient, because now we can write the electric field in terms of its $y$ component,
$E_y$, and its $x$ component, $E_x$. Notice that these are no longer vector quantities,
because we already know which direction they point. If we place an object with charge
$q$ at the coordinates $(0, 0)$, then the electric field along the $y$ direction at the point $(x, y)$
is given by,



$$E_y = \frac{q}{4\pi\epsilon_0}\,\frac{y}{\left(x^2 + y^2\right)^{3/2}} \tag{9.12}$$



which was obtained by taking only the terms of Equation 9.11 which were multiplied by
$\hat{j}$. We could perform the exact same step for $E_x$, but let us stick with $E_y$ for now.

Consider Figure 9.3, where there are two charged objects, one with charge $+q$ and one
with charge $-q$, placed along the y-axis. We call this configuration of electric charges
an electric dipole.⁸ How do we calculate the electric field from this configuration?



8 Each charge is considered a "pole" and there are two, so it is called a dipole.
For this we need to use the idea of linear superposition, which states that the total
electric field from a collection of charged objects can be found by just adding all the
individual electric fields for each charged object, given by Equation 9.11. The term
superposition just means that the fields can be added, and linear means that they can
be added without having to perform some mathematical operation on each field.⁹

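Returning to Figure 9.3, at the point $P = (x, y)$, the electric fields in the $y$ direction
for the positive and negative charges are given by,

$$E_y^+ = \frac{+q}{4\pi\epsilon_0}\,\frac{y - \frac{\ell}{2}}{\left(x^2 + \left(y - \frac{\ell}{2}\right)^2\right)^{3/2}}, \qquad
E_y^- = \frac{-q}{4\pi\epsilon_0}\,\frac{y + \frac{\ell}{2}}{\left(x^2 + \left(y + \frac{\ell}{2}\right)^2\right)^{3/2}} \tag{9.13}$$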



where $E_y^+$ is the electric field from the positive charge and $E_y^-$ is the electric field from
the negative charge. Adding these two electric fields together,

$$E_y^{\text{dipole}} = E_y^+ + E_y^- \tag{9.14}$$

yields the y component of the electric field for both charges. Unfortunately, this is where 
the math starts to get a little more complicated. 

We would like to add $E_y^+$ and $E_y^-$ together, but their denominators are not the same,
differing by a $y - \ell/2$ in $E_y^+$ and a $y + \ell/2$ in $E_y^-$. To get around this problem we use
what is called a Taylor expansion which allows us to approximate a function about a
given value for a variable.¹⁰ The Taylor expansion for the function $f(x)$ is given by,

$$f(x) \approx f(x_0) + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots \tag{9.15}$$

where $x_0$ is the value for the variable $x$ about which we are expanding.¹¹

Looking at Figure 9.3 again, we can imagine looking at the charge configuration from
very far away. When we do this, the distance $\ell$ approaches zero, at least from our
viewpoint. Of course the absolute distance between the two charges stays the same, but
to us, the distance looks much smaller. What this means is that we can approximate the
denominators of $E_y^+$ and $E_y^-$ by performing a Taylor expansion around the point $\ell = 0$.



9 A good example of something that does not normally obey linear superposition is Gaussian uncertainty, 
as discussed in Chapter 1. 

10 Taylor expansions were used earlier in Chapter 1 to determine how to combine normal uncertainty.

11 There is a lot more to Taylor expansions than just giving the definition, but unfortunately
understanding them more fully requires more detail which is not relevant to this discussion.
Applying Equation 9.15 to the denominators of Equation 9.13 yields,

$$\frac{1}{\left(x^2 + \left(y \pm \frac{\ell}{2}\right)^2\right)^{3/2}} \approx
\frac{1}{\left(x^2 + y^2\right)^{3/2}} \mp \frac{3y\ell}{2\left(x^2 + y^2\right)^{5/2}}
+ \frac{3\left(4y^2 - x^2\right)\ell^2}{8\left(x^2 + y^2\right)^{7/2}} \tag{9.16}$$

expanded out to a term quadratic in $\ell$, or $\ell^2$. We can actually ignore the term with $\ell^2$,
because $\ell \approx 0$ and so this number will be very small.
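To see how quickly the expansion converges, the following Python sketch compares the exact denominator factor with the series truncated at zeroth, first, and second order in $\ell$. It assumes the expansion takes the form $1/(x^2+y^2)^{3/2} \mp 3y\ell/(2(x^2+y^2)^{5/2}) + 3(4y^2-x^2)\ell^2/(8(x^2+y^2)^{7/2})$; the function names and the sample point are hypothetical choices of ours.

```python
def exact_term(x, y, ell, sign):
    """1 / (x^2 + (y + sign*ell/2)^2)^(3/2), with sign = +1 or -1."""
    return (x**2 + (y + sign * ell / 2) ** 2) ** -1.5


def taylor_term(x, y, ell, sign, order):
    """Series about ell = 0, truncated after the term of the given order (0, 1, or 2)."""
    r2 = x**2 + y**2
    value = r2**-1.5
    if order >= 1:
        # the linear term carries the opposite sign to the shift in y
        value -= sign * 3 * y * ell / (2 * r2**2.5)
    if order >= 2:
        value += 3 * (4 * y**2 - x**2) * ell**2 / (8 * r2**3.5)
    return value


x, y, ell = 0.3, 0.4, 0.01  # hypothetical: observer 0.5 m away, charges 1 cm apart
errors = [
    abs(exact_term(x, y, ell, +1) - taylor_term(x, y, ell, +1, order))
    for order in range(3)
]
print(errors)  # each extra term should shrink the error
```

For a separation this small compared to the distance, the quadratic term changes the result very little, which is why it can be dropped.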

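Using the first two terms of Equation 9.16 to approximate the denominators of Equation 9.13
and plugging this back into Equation 9.14 yields,

$$\begin{aligned}
E_y^{\text{dipole}} &\approx \frac{q}{4\pi\epsilon_0}\left(\frac{3y^2\ell}{\left(x^2 + y^2\right)^{5/2}} - \frac{\ell}{\left(x^2 + y^2\right)^{3/2}}\right) \\
&= \frac{q}{4\pi\epsilon_0}\left(\frac{\ell\left(3y^2 - x^2 - y^2\right)}{\left(x^2 + y^2\right)^{5/2}}\right) \\
&= \frac{p}{4\pi\epsilon_0}\left(\frac{2y^2 - x^2}{\left(x^2 + y^2\right)^{5/2}}\right)
\end{aligned} \tag{9.17}$$

which is the $y$ component of the electric field from a dipole. In the final step we have
made the substitution $p = q\ell$ which we call the electric dipole moment.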
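The quality of this approximation can be tested numerically. The Python sketch below adds the fields of the two charges directly (linear superposition of Equation 9.12, with the charges placed at $y = \pm\ell/2$) and compares the sum with the dipole formula $E_y = p(2y^2 - x^2)/(4\pi\epsilon_0(x^2+y^2)^{5/2})$. The charge, separation, and observation point are hypothetical values chosen for illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric constant in F/m


def ey_point_charge(q, x, y, y_charge):
    """y component of the field at (x, y) from a charge q sitting at (0, y_charge)."""
    dy = y - y_charge
    return q * dy / (4 * math.pi * EPSILON_0 * (x**2 + dy**2) ** 1.5)


def ey_dipole_exact(q, ell, x, y):
    """Superposition of +q at (0, +ell/2) and -q at (0, -ell/2)."""
    return ey_point_charge(q, x, y, ell / 2) + ey_point_charge(-q, x, y, -ell / 2)


def ey_dipole_approx(q, ell, x, y):
    """Far-field dipole formula with dipole moment p = q * ell."""
    p = q * ell
    return p * (2 * y**2 - x**2) / (4 * math.pi * EPSILON_0 * (x**2 + y**2) ** 2.5)


q, ell = 1e-9, 1e-3  # hypothetical: 1 nC charges separated by 1 mm
x, y = 0.3, 0.4      # observation point 0.5 m from the dipole
exact = ey_dipole_exact(q, ell, x, y)
approx = ey_dipole_approx(q, ell, x, y)
relative_error = abs(exact - approx) / abs(exact)
print(exact, approx, relative_error)
```

As long as the observer is far away compared to $\ell$, the two values agree closely; moving the observation point toward the charges makes the approximation visibly worse.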

While the result above is interesting, one might wonder, why did we go to all this
trouble? The reason is that magnets cannot be split into individual "poles". They
always consist of a north pole and a south pole.¹² What this means is that we cannot
use Equation 9.10 to represent a magnetic field from a typical magnet, but rather an
adjusted form of Equation 9.17. Of course we would also like to know $E_x$ and $E_z$ if we
were working in three dimensional space, but these can be determined using the exact
same method above.

The magnetic field from a magnetic dipole is given by, 

$$B_y^{\text{dipole}} = \frac{\mu_0\mu}{4\pi}\left(\frac{2y^2 - x^2}{\left(x^2 + y^2\right)^{5/2}}\right) \tag{9.18}$$

where $\mu_0$ is the magnetic constant and $\mu$ is the magnetic dipole moment for a specific
dipole. Notice that only the substitutions $\epsilon_0 \to 1/\mu_0$ and $p \to \mu$ have been made.
Remember that the dipole must be aligned along the y-axis for both Equations 9.17 and
9.18 to be valid.

Now we have an equation for the electric field from a single charge, a dipole, and know 
how to calculate the electric field for multiple charges using linear superposition. We also 



12 If you can show that a magnet can be split, or that a magnetic monopole actually exists, you would 
win the Nobel prize. 
Figure 9.4: The field lines of a dipole drawn around the dipole setup of Figure 9.3. If a
charge is placed on one of the field lines it will begin to move along the line.

have an equation for the magnetic field from a typical magnetic dipole. But oftentimes 
equations, while useful for calculations, do not give a physical intuition for the problem. 
Consider what would happen if another free electric charge was placed into the electric 
field from an electric dipole. The charge would feel a force from the electric field, given 
by the Lorentz force, and would begin to move. 

In Figure 9.4 the dipole is drawn in red and blue, and drawn around the dipole are 
field lines. These lines do not represent the strength of the electric field!¹³ Instead,
the field lines represent the direction the electric field is pointing. Physically, if the free 
charge was placed on a field line, it would begin to move along the field line. The exact 
same idea can be applied to a magnetic dipole, but now we can no longer use a single 
free charge to map out the field lines, and instead must use another dipole. If a piece of 
paper filled with iron filings is placed over a common bar magnet, the filings will align 
along the field lines and create the dipole field line pattern seen in Figure 9.4. 

9.5 Experiment 

A large amount of material was presented above, and while the details may have escaped, 
hopefully the general concepts remain. Specifically, electric charges feel a force from 
electric fields and magnetic fields described by the Lorentz force, loops of current create 
magnetic fields described by the Biot-Savart law, current loops are created by changing 
magnetic fields described by Faraday's law, and most magnetic fields can be described 
by a dipole. The experiment for this chapter manages to connect all the ideas above 
into a rather simple to perform but theoretically complex experiment. 

Consider a conducting pipe (for this experiment we use copper) which has some resistance
$R$, and some radius $r_0$. What happens if we take a magnet with a dipole moment $\mu$
and drop it down the copper tube as shown in Figure 9.5? Without taking into account
any of the theory above we would say quite simply that the magnet will fall with an
acceleration of $g \approx 9.81$ m/s$^2$ by Newton's second law. But this is not the whole story!
A changing magnetic field is passing through a conducting loop (the copper tube), so 
an electric current, called an eddy current, is induced within the copper tube and can 
be described by Faraday's law. The current loop within the pipe creates a magnetic 
field described by the Biot-Savart law, which produces a force on the falling magnet, 
counteracting the force of gravity. If the magnet falls fast enough, the induced magnetic 
field in the pipe will be large enough to completely counteract the force of gravity and 
the magnet will reach a terminal velocity. 

The idea of the experiment associated with this chapter is to measure the terminal 
velocity at which the magnet falls through the copper tube. While we have a qualitative 
prediction (the magnet will fall more slowly than just through air) it would be even
better to make a quantitative prediction for the terminal velocity of the magnet. While 
the modeling of eddy currents is very complex, we can make a reasonable estimate of the 



13 It is important to remember this, and even many experienced physicists forget. The field lines of a
magnetic dipole are almost always shown instead of the actual magnitude of the force because it
helps readers understand the direction in which the field is pointing.
Figure 9.5: The experimental setup for this chapter. A magnetic dipole, indicated by the
red and blue circles, is falling through a copper tube, due to a gravitational
force $mg$. The changing magnetic fields from the dipole induce eddy currents
above and below the magnetic field (shown in red), which exert a magnetic
force, $F_B$, on the magnetic dipole, counteracting the gravitational force.
terminal velocity using only the theory described above and a few simple assumptions.
To begin, let us first think of the magnet falling through a single copper loop, rather 
than a pipe, and calculate the current induced within the copper loop using Faraday's 
law. 

We can approximate the magnetic field of the magnet using the dipole field of Equation
9.18 with just a few adjustments. First, we have changed from two dimensions to three
dimensions but this actually is not a problem. The magnetic dipole is aligned along the
$z$ direction, so we simply substitute $z$ for $y$. Next, because the pipe is symmetrical in $x$
and $y$, we can substitute $r$ for $x$. Making these substitutions yields,

$$B_z = \frac{\mu_0\mu}{4\pi}\left(\frac{2z^2 - r^2}{\left(z^2 + r^2\right)^{5/2}}\right) \tag{9.19}$$



which tells us the magnetic field along the $z$ direction due to the magnetic dipole. For
Faraday's law, the magnetic field component along the $r$ direction, $B_r$, does not matter
because this is parallel to the area of the copper loop, and so $\vec{B}_r \cdot \vec{A} = 0$.

Now, to calculate Faraday's law using Equation 9.7 we must take the dot product of
the magnetic field, $B_z$, with the area, $A$, through which the magnetic field passes. Unlike
the previous example for Faraday's law, the magnetic field is not constant with respect
to $r$, so we must integrate over the magnetic field.

$$\vec{B}_z \cdot \vec{A} = \int_0^{r_0} B_z\, 2\pi r\, dr = \frac{\mu_0\mu r_0^2}{2\left(z^2 + r_0^2\right)^{3/2}} \tag{9.20}$$
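The integral above can be checked numerically. The sketch below, in Python, assumes the axial field $B_z = \mu_0\mu(2z^2 - r^2)/(4\pi(z^2+r^2)^{5/2})$, sums $B_z\,2\pi r\,dr$ from the center of the loop out to $r_0$ with a simple midpoint rule, and compares the result with the closed form $\mu_0\mu r_0^2/(2(z^2+r_0^2)^{3/2})$. The dipole moment, loop radius, and distance are hypothetical values.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # magnetic constant in T m / A (conventional value)


def b_z(mu, r, z):
    """Axial dipole field at radius r and height z from the dipole."""
    return MU_0 * mu * (2 * z**2 - r**2) / (4 * math.pi * (z**2 + r**2) ** 2.5)


def flux_numeric(mu, r0, z, steps=10_000):
    """Midpoint-rule estimate of the integral of B_z * 2*pi*r dr from 0 to r0."""
    dr = r0 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += b_z(mu, r, z) * 2 * math.pi * r * dr
    return total


def flux_closed_form(mu, r0, z):
    """Closed-form result of the same integral."""
    return MU_0 * mu * r0**2 / (2 * (z**2 + r0**2) ** 1.5)


mu, r0, z = 1.0, 0.01, 0.015  # hypothetical: 1 A m^2 dipole, 1 cm loop, 1.5 cm away
numeric = flux_numeric(mu, r0, z)
closed = flux_closed_form(mu, r0, z)
print(numeric, closed)
```

The two numbers should agree to many digits, which is a useful sanity check before differentiating the flux.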



Taking the derivative of this with respect to time is a bit tricky, because there is no time 
dependence! To get around this we must apply the chain rule, and first differentiate the 
magnetic field with respect to z. 



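$$V = -\frac{d\left(\vec{B}_z \cdot \vec{A}\right)}{dt} = -\frac{d\left(\vec{B}_z \cdot \vec{A}\right)}{dz}\frac{dz}{dt} = \frac{3\mu_0\mu r_0^2 z}{2\left(z^2 + r_0^2\right)^{5/2}}\frac{dz}{dt} \tag{9.21}$$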



Something quite interesting just occurred. We now have a $dz/dt$ term, which is just the
velocity at which the magnet is falling away from the current loop, $v = dz/dt$!
The induced current within the copper loop, $I$, can be found using Ohm's law¹⁴,

$$I = \frac{V}{R} = \frac{3v\mu_0\mu r_0^2 z}{2R\left(z^2 + r_0^2\right)^{5/2}} \tag{9.22}$$



assuming the copper loop has a resistance R. We could use the Biot-Savart law to 
calculate the induced magnetic field from this current loop and then determine the force 
exerted on the falling magnet, but it is simpler to use the Lorentz force and calculate the 
force exerted on the flowing charge in the copper loop by the falling magnet. Because 
there must always be an equal and opposite force, the same force must be applied to the 
magnet, but in the opposite direction. 



14 If Ohm's law is a little hazy go back and take a look at Equation 8.1 in Chapter 8.
The definition of current, given in Chapter 8, is the amount of charge passing per unit
time. If the current is multiplied by the length of the loop through which it is circulating,
this can be multiplied by the magnetic field from the falling magnet to find the $q\vec{v} \times \vec{B}$
force term of Equation 9.1.¹⁵ If we calculate the force exerted on the loop from the $B_z$
component of the magnetic field, the direction of the force is pointing radially outward,
and will not counteract gravity. However, if we calculate the Lorentz force from the
radial $B_r$ component of the magnetic field, the direction will counteract gravity. The
radial component from a magnetic dipole is¹⁶,

$$B_r = \frac{3\mu_0\mu r z}{4\pi\left(z^2 + r^2\right)^{5/2}} \tag{9.23}$$

and so the force exerted on the magnet is,

$$F_B = 2\pi r_0 B_r I = \frac{9\mu_0^2\mu^2 r_0^4 z^2}{4R\left(z^2 + r_0^2\right)^5}\,v \tag{9.24}$$

where $B_r$ was evaluated at the radius of the copper loop, $r_0$. We now know the force
exerted on the falling magnet, given the distance $z$ which the magnet is from the copper
loop, and the velocity $v$ at which the magnet is falling.

Let us now return to the more complicated idea of the magnet falling through a
copper pipe. First, assume that the eddy currents within the pipe caused by the falling
magnet can be approximated by an eddy current flowing directly above the magnet and
another eddy current flowing in the opposite direction, directly below the magnet. As
the magnet falls, these eddy currents follow above and below the magnet. Next, assume
that the current in these loops is described reasonably well by Equation 9.22. In this
case we don't know $z$, the distance of the eddy currents from the magnet, but we can
write $z$ in terms of $r_0$,

$$z = Cr_0 \tag{9.25}$$

where $C$ is some unknown constant. Additionally, if we assume that the magnet has
reached its terminal velocity, then $v$ is constant and no longer dependent upon $z$. Plugging
this into Equation 9.24 we obtain,



$$F_B = 2\,\frac{9\mu_0^2\mu^2 C^2}{4R\left(1 + C^2\right)^5 r_0^4}\,v \tag{9.26}$$



which is just in terms of the resistance of the copper tube, the radius of the copper 
tube, the terminal velocity of the magnet, and the dipole moment of the magnet. Notice 
the factor of two in front of Equation 9.26. This is because two eddy currents are 
contributing to the force on the magnet. 

Because the magnet is at terminal velocity it is no longer accelerating, and so the 
forces acting on it must balance. The gravitational force on the magnet is just mg, 



15 This step might require a bit of thought.

16 This was not derived, but the exact same method used in the previous section to determine $B_y^{\text{dipole}}$
can be used, but now for $B_x^{\text{dipole}}$.
where $m$ is the mass of the magnet, and $g$ is the acceleration due to gravity. Setting the
gravitational force equal to the magnetic force gives,

$$\begin{aligned}
mg &= 2\,\frac{9\mu_0^2\mu^2 C^2}{4R\left(1 + C^2\right)^5 r_0^4}\,v \\
v &= \frac{2mgR\left(1 + C^2\right)^5 r_0^4}{9\mu_0^2\mu^2 C^2}
\end{aligned} \tag{9.27}$$

where in the second line $v$ has been solved for. It turns out from experiment that
$C \approx 1.37$, which corresponds to an eddy current one third the maximum possible current
of Equation 9.22. Physically, this means that we can think of the eddy currents being
created above and below the magnet at a distance of $1.37r_0$, or 1.37 times the radius
of the pipe. Now, with Equation 9.27, we have a velocity that we can verify with
experiment.
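Before going into the laboratory it can help to put numbers into the prediction. The Python sketch below takes the terminal velocity in the form $v = 2mgR(1+C^2)^5r_0^4/(9\mu_0^2\mu^2C^2)$ and also checks the "one third" statement by comparing the $z$ dependence of the induced current, $C/(1+C^2)^{5/2}$ with $z = Cr_0$, against its maximum at $C = 1/2$. The mass, resistance, radius, and dipole moment are hypothetical values, not measurements, and the function names are our own.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # magnetic constant in T m / A (conventional value)
G = 9.81                   # acceleration due to gravity in m/s^2


def current_shape(c):
    """Dimensionless z dependence of the induced current with z = c * r0."""
    return c / (1 + c**2) ** 2.5


def terminal_velocity(m, resistance, r0, mu, c=1.37):
    """Terminal velocity (m/s) from balancing gravity against the eddy current force."""
    return 2 * m * G * resistance * (1 + c**2) ** 5 * r0**4 / (9 * MU_0**2 * mu**2 * c**2)


# The current shape peaks where its derivative vanishes, at c = 1/2.
ratio_to_max = current_shape(1.37) / current_shape(0.5)

# Hypothetical setup: 10 g magnet, 0.1 milliohm loop, 8 mm pipe radius, 1 A m^2 dipole.
v_light = terminal_velocity(0.010, 1e-4, 0.008, 1.0)
v_heavy = terminal_velocity(0.020, 1e-4, 0.008, 1.0)
print(ratio_to_max, v_light, v_heavy)
```

Because the radius enters as $r_0^4$ and the dipole moment as $\mu^2$, small uncertainties in either quantity have a large effect on the predicted velocity.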
10 Waves



Waves are a critical part of everyday life, but are oftentimes overlooked. Without waves 
we could not see or hear; television, radio, internet, and cell phones all would not be 
possible without an extensive understanding of waves and how they behave. Fully un- 
derstanding waves is not a simple task, but a large body of literature is available to 
those eager to learn more. A good introduction to wave phenomena can be found in 
Vibrations and Waves by A. P. French, while a more detailed analysis is given in Elec- 
tromagnetic Vibrations, Waves, and Radiation by Bekefi and Barrett. A large body of 
course notes and examples is available through the MIT OpenCourseWare materials for 
8.03 as taught by Walter Lewin. 

10.1 Types of Waves 

But what exactly is a wave? The following definition is given by the Oxford English 
Dictionary. 

"Each of those rhythmic alternations of disturbance and recovery of configuration 
in successively contiguous portions of a fluid or solid mass, by which a state of 
motion travels in some direction without corresponding progressive movement of 
the particles successively affected. Examples are the waves in the surface of water 
(sense 1), the waves of the air which convey sound, and the (hypothetical) waves of 
the ether which are concerned in the transmission of light, heat, and electricity." 

This definition is hardly satisfying and rather long-winded, but illustrates some of the 
difficulties of rigorously defining exactly what a wave is. Perhaps a simpler definition 
would be "a periodic variation traveling within a medium". Of course this simpler 
definition is somewhat inadequate as well; an electromagnetic wave such as a radio 
wave does not need to travel through any medium, yet it is still a wave¹. Despite the
difficulty in defining a wave, most people have a decent intuition for waves from everyday 
interactions with everything from water waves to sound and light waves. 

There exist two general types of waves, transverse and longitudinal. In a transverse 
wave the "periodic variation" is perpendicular to the direction of propagation of the 
wave as shown in Figure 10.1(a). A classic example of this is shaking the end of a 
rope. The wave produced by the shaking of the rope travels horizontally away from the 
experimenter along the rope while the actual displacement of the rope is vertical, either 
up or down. Similarly, when a guitar string is plucked the string is displaced vertically, 
but the wave travels horizontally. Other common types of transverse waves are light 
waves (as are all electromagnetic waves) and water waves. 



1 A consequence of the wave/particle duality of electromagnetic radiation.
Figure 10.1: The two different types of waves: (a) transverse wave and (b) longitudinal wave.

Figure 10.2: The fundamental properties used to describe a wave.

For a longitudinal wave, shown in Figure 10.1(b) the periodic variation is parallel to the 
direction of propagation. Perhaps the most commonly experienced longitudinal waves 
are sound waves. Vibrations from an object, whether an instrument or voice, compress 
the surrounding air and create a pocket of high pressure. This pocket of compressed air 
travels in the direction of the sound wave until it impacts another object, such as the 
human ear. 



10.2 Properties of Waves 

While the shape of waves can vary greatly, three fundamental properties can be used to 
describe all waves. These three properties are wavelength, velocity, and amplitude. 
Further quantities derived from the three quantities above but also useful to describe 
waves are frequency, angular frequency, and period. Of the two types of waves, transverse 
and longitudinal, it is oftentimes simpler to visualize transverse waves and so here, the 
properties of waves are described using a transverse wave in the form of a sine wave in 
Figure 10.2. The exact same properties describe longitudinal waves and other types of 
transverse waves as well, but are not as simple to depict. 

The wavelength for a wave is usually denoted by the Greek letter $\lambda$ (spelled lambda)
and is given in units of distance, traditionally meters. In Figure 10.2 the wavelength is 
defined as the distance between identical points on two consecutive oscillations of the 
wave. Here, the wavelength is measured from the crest of the first oscillation to the 
crest of the second oscillation. It is also possible to measure the wavelength from the 
equilibrium of the wave, but this leads to the common problem of accidentally measuring 
only half a wavelength. The wavelength of visible light ranges from 400 nm (violet) up 
to about 600 nm (red). Audible sound waves have wavelengths ranging from around 20 
m (very low) to 2 cm (very high). 

The velocity of a wave, v, is a vector, and consequently has an associated direction 
with units of distance over time. The velocity for a wave can be found by picking a 
fixed point on the wave and measuring the distance traveled by the point over a certain
time. The speed of light within a vacuum is around $3.0 \times 10^8$ meters per second, while
the speed of sound at standard temperature and pressure is only 340 meters per second.
The speed of sound is highly dependent upon the density of the medium through which
the sound wave is traveling and is given by,

$$v = \sqrt{\frac{C}{\rho}} \tag{10.1}$$

where $C$ is the elasticity of the medium, and $\rho$ the density. As can be seen, the lower
the density of the medium, the faster the velocity of the sound wave (echoes from the top
of Mount Kilimanjaro return faster than echoes here in Dublin).

The frequency for a wave is obtained by dividing the velocity of the wave by the 
wavelength of the wave. 

$$f = \frac{v}{\lambda} \tag{10.2}$$

Frequency is usually denoted by the letter $f$ but is sometimes denoted using the Greek
letter $\nu$ (spelled nu). With dimensional analysis and the equation above it is possible
to see that the units of frequency are one over time. A more intuitive explanation of
frequency is to imagine a wave passing in front of an observer. The frequency is the
number of oscillations that pass in front of the observer per unit time. Traditionally,
frequency is given in Hertz (Hz) or oscillations per second. Using the velocities and
wavelengths given above for both light and sound, the frequency of visible light is between
the range of $7.5 \times 10^{14}$ Hz (violet light) and $5 \times 10^{14}$ Hz (red light) while the frequency
of audible sound is between 20 and 20,000 Hz.

Another method of expressing frequency is angular frequency or angular velocity,
given by the Greek letter $\omega$, and introduced earlier in Chapter 5. As implied by the name,
angular frequency is oftentimes used for problems involving rotational motion, and is
just the velocity in radians per second at which an object is rotating. It is possible to
write angular frequency in terms of normal frequency,

$$\omega = 2\pi f \tag{10.3}$$

with $f$ given in Hz and $\omega$ in radians per second.

The period of a wave is the time it takes one oscillation to pass an observer, or just 
the inverse of the frequency. Period is denoted by the letter T and given in units of time. 

$$T = \frac{1}{f} \tag{10.4}$$
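The relations between wavelength, frequency, angular frequency, and period are simple enough to try out directly. The Python sketch below uses the rounded velocities and wavelengths quoted in this section, with frequency taken as velocity divided by wavelength, angular frequency as $2\pi$ times frequency, and period as one over frequency; the function names are our own.

```python
import math

SPEED_OF_LIGHT = 3.0e8  # m/s, rounded as in the text
SPEED_OF_SOUND = 340.0  # m/s at standard temperature and pressure


def frequency(velocity, wavelength):
    """Frequency in Hz from velocity (m/s) and wavelength (m)."""
    return velocity / wavelength


def angular_frequency(f):
    """Angular frequency in radians per second from frequency in Hz."""
    return 2 * math.pi * f


def period(f):
    """Period in seconds from frequency in Hz."""
    return 1 / f


violet = frequency(SPEED_OF_LIGHT, 400e-9)  # violet light, 400 nm
red = frequency(SPEED_OF_LIGHT, 600e-9)     # red light, taking 600 nm as in the text
low_note = frequency(SPEED_OF_SOUND, 20.0)  # a 20 m sound wave
high_note = frequency(SPEED_OF_SOUND, 0.02) # a 2 cm sound wave
print(violet, red, low_note, high_note)
print(angular_frequency(low_note), period(low_note))
```

The sound frequencies land close to, though not exactly on, the 20 Hz and 20,000 Hz limits because the wavelengths quoted above are rounded.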

The amplitude of a wave, as shown in Figure 10.2, is denoted by the letter A and 
is a measure of the distance of the peak of an oscillation to the equilibrium of the 
wave. The unit for amplitude is generally distance, but depends on the type of wave. 
Sometimes waves are not symmetric like the wave shown in Figure 10.2, and so the type
of amplitude defined above is renamed peak amplitude and a new amplitude called
root mean square (RMS) amplitude is used instead. The RMS amplitude for a
wave is found first by squaring the distance of the wave from equilibrium. The mean of
this squared value is found, and the square root taken to return the RMS amplitude.
More mathematically,

$$A_{\text{RMS}} = \sqrt{\frac{\int x^2\, dt}{\int dt}} \tag{10.5}$$

where x is the distance of the wave at time t from equilibrium. The RMS amplitude 
is important for sound and light waves because the average power of the wave is pro- 
portional to the square of the RMS amplitude. For example, a very bright light source, 
such as a halogen bulb, emits light waves with very large RMS amplitudes. 
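For sampled data the integrals in the RMS definition become sums. The Python sketch below is a discrete version of that definition for equally spaced samples, applied to one full period of a sine wave; the peak amplitude and the number of samples are arbitrary choices of ours.

```python
import math


def rms_amplitude(samples):
    """Square each sample, take the mean, then take the square root."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


peak = 2.0
n = 10_000
# one full period of a sine wave, sampled at equally spaced times
one_period = [peak * math.sin(2 * math.pi * i / n) for i in range(n)]
rms = rms_amplitude(one_period)
print(rms, peak / math.sqrt(2))
```

For a pure sine wave the RMS amplitude is the peak amplitude divided by $\sqrt{2}$, which is what the two printed numbers compare.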

10.3 Standing Waves 

One of the most interesting and important phenomena of waves is the standing wave 
which is what makes music possible. In a standing wave, places along the vibrating 
medium (take for example a guitar string) do not move. These non-moving points are 
called nodes. Using the guitar string example, it is possible to see that by necessity there 
must be at least two nodes, one for each fixed end of the string. When there are only 
nodes at the end points of the string, no nodes in the middle, the sound emitted is called 
the fundamental or first harmonic. The standing wave produced when the frequency 
of the wave is increased sufficiently to produce a node in the middle of the guitar string 
is called the first overtone or second harmonic. Whenever the frequency of the wave 
is increased enough to produce another node, the next harmonic or overtone is reached. 
Figure 10.3 gives the first four harmonics of a guitar string.²

Looking again at the example above we can derive a relationship for the fundamental 
frequency of a vibrating string. To begin, we notice that the wavelength is just twice 
the length of the string and so we can write frequency in terms of velocity of the wave 
on the string, v, and length of the string, L. 

$$f = \frac{v}{2L} \tag{10.6}$$

Unfortunately we do not usually know the velocity at which a wave travels through
a string. However, we can find the velocity by examining the force diagram for an
infinitesimally small portion of the string. The next portion of this chapter becomes a little
mathematically involved, but the end result is beautiful.

For a wave traveling in two dimensions, $x$ and $y$, along the $x$-axis, we can write that
the second derivative of $y$ with respect to $x$ is equal to some coefficient $C$ times the


2 Here we use a sine wave for simplicity, but technically this can be any wave that satisfies the wave
equation.

Figure 10.3: The first four harmonics for a standing wave fixed at both ends.



acceleration of the wave in the y direction. 

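$$\frac{\partial^2 y}{\partial x^2} = C\,\frac{\partial^2 y}{\partial t^2} \tag{10.7}$$

This comes directly from the two-dimensional case of the wave equation.³ If we substitute
in some function for our wave, in this case,

$$y = A\sin(x - vt) \tag{10.8}$$

where $v$ is the velocity of the wave, we can find a value for $C$.

$$\begin{aligned}
\frac{\partial^2 y}{\partial x^2} &= A\sin(vt - x), \qquad \frac{\partial^2 y}{\partial t^2} = v^2 A\sin(vt - x) \\
A\sin(vt - x) &= Cv^2 A\sin(vt - x) \;\Rightarrow\; C = \frac{1}{v^2}
\end{aligned} \tag{10.9}$$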



Now we use a force diagram to obtain a relationship similar to Equation 10.7 but for
this particular situation. The force diagram for an infinitesimally small portion of the
string is shown in Figure 10.4. We begin by finding the net force in the $y$ direction.

$$F_y = T_1\sin\theta_1 - T_2\sin\theta_2 \tag{10.10}$$

We can now use Newton's second law to equate this to $ma$,

$$T_1\sin\theta_1 - T_2\sin\theta_2 = \mu\Delta x\,\frac{\partial^2 y}{\partial t^2} \tag{10.11}$$

where $\mu\Delta x$ is the mass of the piece of string (density $\mu$ times distance), and $\partial^2 y/\partial t^2$ the
acceleration of the string along the $y$ direction.



3 If you want to read more about the wave equation, consult the resources given at the start of this
chapter.

Figure 10.4: Force diagram for an infinitesimally small portion of a string vibrating at
its fundamental frequency.

Because the string is not moving in the $x$ direction and because the string is not
deforming much from equilibrium, we can assume that the net force in the $x$ direction
is 0 and that the horizontal force in either direction is approximately $T$, the tension on
the entire string.

$$T_1\cos\theta_1 - T_2\cos\theta_2 \approx T - T = 0 \tag{10.12}$$



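Dividing Equation 10.11 by $T$ and using $T \approx T_{1,2}\cos\theta_{1,2}$ from above we obtain the
following.

$$\frac{T_1\sin\theta_1}{T} - \frac{T_2\sin\theta_2}{T} = \frac{T_1\sin\theta_1}{T_1\cos\theta_1} - \frac{T_2\sin\theta_2}{T_2\cos\theta_2} = \tan\theta_1 - \tan\theta_2 = \frac{\mu\Delta x}{T}\frac{\partial^2 y}{\partial t^2} \tag{10.13}$$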



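As $\Delta x$ approaches zero, $\tan\theta_1 - \tan\theta_2$ becomes, by definition, the change in the slope
$\partial y/\partial x$ across the piece of string. Dividing both sides by $\Delta x$ then gives,

$$\frac{1}{\Delta x}\,\Delta\!\left(\frac{\partial y}{\partial x}\right) = \frac{\partial^2 y}{\partial x^2} = \frac{\mu}{T}\frac{\partial^2 y}{\partial t^2} \tag{10.14}$$

which looks almost exactly like Equation 10.7! Setting,

$$\frac{1}{v^2} = \frac{\mu}{T} \tag{10.15}$$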



we can solve for $v$ in terms of $\mu$, the linear density of the string, and $T$, the tension of
the string.

$$v = \sqrt{\frac{T}{\mu}} \tag{10.16}$$



Placing this value for $v$ back into Equation 10.6, we can now write the fundamental
frequency for a string in terms of length of the string $L$, density of the string $\mu$, and
tension of the string $T$.

$$f = \frac{1}{2L}\sqrt{\frac{T}{\mu}} \tag{10.17}$$
Figure 10.5: The magnetic field lines of the experiment. The blue circles represent wires
running perpendicular to the plane of the page. The wire with a x symbol
in the middle has current running into the page, whereas the wire with a •
symbol has current running out of the page. The black boxes represent the
poles of the horse-shoe magnet.

Intuitively this equation makes sense. Shorter strings produce higher frequencies; a bass
produces very low notes, while a violin produces very high notes. Similarly, very taut
strings, like those on a violin, produce higher sounds than loose strings. Anyone who has
tuned a string instrument knows that by tightening the string the instrument becomes
sharp, and by loosening the string, the instrument becomes flat. Finally, the denser the
string, the lower the sound. This is the reason most stringed instruments do not use gold or
lead strings, but rather tin, steel, or plastic.
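These three trends can be confirmed with a short calculation. The Python sketch below evaluates the fundamental frequency $f = \sqrt{T/\mu}/(2L)$ and then changes one quantity at a time. The string length, tension, and linear density are hypothetical values of roughly the right size for a guitar string, not measured ones.

```python
import math


def fundamental_frequency(length, tension, linear_density):
    """Fundamental frequency (Hz) of a string fixed at both ends.

    length in m, tension in N, linear_density in kg/m.
    """
    return math.sqrt(tension / linear_density) / (2 * length)


L, T, mu = 0.65, 70.0, 4.0e-4  # hypothetical string
f0 = fundamental_frequency(L, T, mu)
f_short = fundamental_frequency(L / 2, T, mu)  # half the length
f_tight = fundamental_frequency(L, 4 * T, mu)  # four times the tension
f_dense = fundamental_frequency(L, T, 4 * mu)  # four times the density
print(f0, f_short, f_tight, f_dense)
```

Halving the length or quadrupling the tension each doubles the frequency, while quadrupling the density halves it.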

10.4 Experiment 

Despite some of the mathematical intricacies of the argument above we have the end
result that the fundamental frequency for a vibrating string is given by,

$$f = \frac{1}{2L}\sqrt{\frac{T}{\mu}} \tag{10.18}$$

where $\mu$ is the linear density of the string, and $T$ the tension on the string. The equation
above can be rewritten as,

$$f = \frac{1}{2}L^{-1}\left(\frac{T}{\mu}\right)^{1/2} = kL^n\left(\frac{T}{\mu}\right)^r \tag{10.19}$$

which matches the general formula given in the lab manual when $k = 1/2$, $n = -1$, and
$r = 1/2$.

The goal of this experiment is to verify the theory above by determining values for n 
and r experimentally. This is done by oscillating a string and varying the length and 
tension of the string to determine values for n and r respectively. 

The experimental apparatus consists of a metal wire (acting as our string) through
which an alternating current is driven. The changing current causes the magnetic field
lines surrounding the wire (due to moving charge) to flip between clockwise and
counterclockwise as shown in Figure 10.5. The horse-shoe magnet provides a constant magnetic
field so that the string is pulled back and forth, causing the wire to oscillate. The weight
at the end of the wire provides a known tension on the wire.

Because it is oftentimes easier to visualize linear relationships, we use a common trick
in physics and take the natural log of both sides of Equation 10.19.

$$\ln f = \ln\left[k L^{n} \left(\frac{T}{\mu}\right)^{r}\right] \tag{10.20}$$

Next we use some of the algebraic properties of logarithms,

$$\begin{aligned}
\ln(ab) &= \ln a + \ln b \\
\ln\left(\frac{a}{b}\right) &= \ln a - \ln b \\
\ln\left(a^{b}\right) &= b \ln a
\end{aligned} \tag{10.21}$$

to write a linear relationship between ln f and ln L, and ln f and ln T.

$$\begin{aligned}
\ln f &= n \ln L + \ln\left[k \left(\frac{T}{\mu}\right)^{r}\right] \\
\ln f &= r \ln T + \ln\left[\frac{k L^{n}}{\mu^{r}}\right]
\end{aligned} \tag{10.22}$$



From here we can determine values for n when we vary L and measure f, and similarly
we can find r when we vary T and measure f. Finally, if we assume that k = 1/2 we
can determine the linear density of the resonating wire, $\mu$.
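
The analysis described above can be sketched in a few lines of Python. This is only an
illustration, not the procedure from the lab manual: the "measurements" are generated from
the theory itself, the values chosen for the linear density, tension, and length are made
up, and the helper names fit_line and frequency are hypothetical. The slope of a straight
line fitted to ln f against ln L gives n, the slope of ln f against ln T gives r, and the
intercept of the second fit gives the linear density once k = 1/2 is assumed.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def frequency(length, tension, mu, k=0.5, n=-1.0, r=0.5):
    """General formula f = k L^n (T/mu)^r; the defaults reproduce the theory."""
    return k * length ** n * (tension / mu) ** r


# Hypothetical, noise-free "measurements" in SI units (all values are assumptions).
MU = 1.0e-3        # linear density in kg/m
TENSION = 9.81     # fixed tension in N while the length is varied
LENGTH = 0.60      # fixed length in m while the tension is varied

lengths = [0.30, 0.40, 0.50, 0.60, 0.70]
tensions = [4.9, 9.8, 14.7, 19.6, 24.5]

# Vary L at fixed T: the slope of ln f against ln L is n.
n_fit, _ = fit_line(
    [math.log(L) for L in lengths],
    [math.log(frequency(L, TENSION, MU)) for L in lengths],
)

# Vary T at fixed L: the slope of ln f against ln T is r.
r_fit, intercept = fit_line(
    [math.log(T) for T in tensions],
    [math.log(frequency(LENGTH, T, MU)) for T in tensions],
)

# The intercept is ln(k L^n / mu^r); with k = 1/2 this can be solved for mu.
mu_fit = (0.5 * LENGTH ** n_fit / math.exp(intercept)) ** (1.0 / r_fit)

print(f"n = {n_fit:.3f}, r = {r_fit:.3f}, mu = {mu_fit:.3e} kg/m")
```

With real data the points will scatter about the line, so the fitted slopes should be
quoted with their uncertainties before comparing them with n = -1 and r = 1/2.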
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Optics is a subject oftentimes used to study other phenomena in physics such as quantum 
mechanics or general relativity, but is rarely studied as its own subject. Unfortunately, 
this means that many physicists have learned optics as a patchwork of examples rather 
than a complete field. This chapter attempts to give a brief cohesive introduction to
geometrical optics, but only touches the tip of the iceberg. Fermat's principle is introduced
to explain the phenomena of reflection and refraction (and subsequently dispersion) in
the first section and these principles are applied to thin lenses in the second section. A
slightly more advanced look at geometrical optics using matrix methods is introduced
in the third section, and these methods are then applied to the thick lens in the fourth
section to derive the lensmaker's formula. Finally, the Fresnel equations are discussed
in the fifth section.

To begin, we must first differentiate between geometrical optics and physical optics.
In geometrical optics the wavelength of the light passing through the optical setup
is much smaller than the size of the apparatus. For example, visible light passing through
a pair of glasses would be accurately modeled by geometrical optics as the wavelength of
the light is $\approx 5 \times 10^{-5}$ cm while the size of the glasses is on the order of 1 cm. Physical
optics models light when the wavelength of the light is of the same order as the size of the
optical apparatus. One can think of geometric optics as the limit of physical optics for
$\lambda \ll x$ where x is the size of the optical apparatus. Additionally, because light is treated as
rays, geometrical optics cannot account for the polarization and interference of light.

In geometrical optics light can be translated, refracted, dispersed, and reflected.
The translation of light is just light traveling in a straight line through some medium
such as glass or water. The refraction of light is when light passes from one medium
to another medium, where the index of refraction for the first medium is different from
the second. When this process occurs, the light is bent by an angle which is governed
by Snell's law. The dispersion of light occurs when light is separated by wavelength.
Dispersion occurs in a prism through the process of refraction, and so dispersion will
not be discussed further in this chapter. The final phenomenon, reflection, is when light
bounces off an object, governed by the law of reflection.



11.1 Fermat's Principle 

All three principal actions that can be performed on light, translation, reflection, and
refraction, can be derived from Fermat's principle. This principle states that light
is always in a hurry; it tries to get from point A to point B in the shortest amount
of time. [1] The first consequence of this principle is that light translates or travels in a
straight line. Ironically, while the idea that a straight line is the fastest route between
two points seems painfully obvious, the actual mathematics behind it is non-trivial. [2]
Proving the laws behind reflection and refraction, however, is much simpler.

Figure 11.1: Diagrams used to prove both the law of reflection, Figure 11.1(a), and Snell's
law, Figure 11.1(b), using Fermat's principle.

Consider the diagram of Figure 11.1(a) where a light ray begins at point A, hits a
surface at point R at distance x along the surface from A, and reflects to point B at
distance d along the surface from A. We already know that the light must travel in a
straight line, but what is the fastest path between A and B? The first step is to write out
the total time the path of the light takes. This is just the distance traveled divided by the
speed of the light as it travels through a medium with refractive index $n_0$ where,

$$n = \frac{c}{v} \tag{11.1}$$



or in words, the refractive index is the speed of light in a vacuum divided by the speed 
of light in the medium through which it is traveling. 

The total time of the path is then just $t = n_0 L / c$ where L is the path length. Using
the Pythagorean theorem the path length is,

$$L = \sqrt{a^2 + x^2} + \sqrt{b^2 + (d - x)^2} \tag{11.2}$$



where a is the distance of point A above the surface, and b the distance of B above the
surface. Now we wish to find the value for x that provides the minimum total time for
the path. First we take the derivative of the path time t with respect to the distance x.

$$\frac{dt}{dx} = \frac{n_0}{c}\left(\frac{x}{\sqrt{a^2 + x^2}} - \frac{d - x}{\sqrt{b^2 + (d - x)^2}}\right) \tag{11.3}$$

This is just the slope of t(x), which when equal to zero yields either a local minimum or
maximum of the function t(x). [3]

[1] Alternatively, because light must always go the same speed in a specific medium, we can think of light
trying to take the shortest possible path, and subsequently light is not in a hurry, but just lazy. On
another note, this statement of Fermat's principle is the original form, but also is not quite correct
mathematically speaking; sometimes light will take the longest path.

[2] For those readers who are interested in learning more, try reading Geodesics: Analytical and Numerical
Solutions by Coblenz, Ilten, and Mooney.

Setting Equation 11.3 to zero yields an extremum for the path time t. Without looking
at a graph or taking the second derivative of t(x) it is impossible to tell if this extremum
is a maximum or a minimum, but to save time, let us proceed with the knowledge that
this extremum is a minimum. This means that the path time is minimized when the
following relation is true.

$$\frac{x}{\sqrt{a^2 + x^2}} = \frac{d - x}{\sqrt{b^2 + (d - x)^2}} \tag{11.4}$$

But the left side of the equation is the sine of the angle of incidence, $\sin\theta_0$, and the
right side of the equation is the sine of the angle of reflection, $\sin\theta_1$. Note that for both
reflection and refraction the angles of the incoming and outgoing rays are defined as the
angle between the light ray and the normal to the surface of the reflecting or refracting
object. The normal to a surface is the line perpendicular to the surface at that point; for
example, the normal to the surface of a sphere always lies along a radius. Plugging
$\sin\theta_0$ and $\sin\theta_1$ back into Equation 11.4 just gives us the law of reflection,

$$\theta_0 = \theta_1 \tag{11.5}$$

which tells us the incident angle is equal to the reflected angle!

For refraction we can proceed with the exact same procedure that we used for reflection,
but now the velocity of the light from point A to R, shown in Figure 11.1(b), is
different from the velocity of the light traveling from point R to B. This yields the total
time traveled as,

$$t = \frac{n_0}{c}\sqrt{a^2 + x^2} + \frac{n_1}{c}\sqrt{b^2 + (d - x)^2} \tag{11.6}$$

where $n_0$ is the index of refraction for the first medium and $n_1$ is the index of refraction
for the second medium. Finding the minimum using the method above results in,

$$\frac{n_0}{c}\frac{x}{\sqrt{a^2 + x^2}} = \frac{n_1}{c}\frac{d - x}{\sqrt{b^2 + (d - x)^2}} \tag{11.7}$$

which can be further reduced to Snell's law,

$$n_0 \sin\theta_0 = n_1 \sin\theta_1 \tag{11.8}$$

which dictates how light is refracted! Another way to think of Snell's law is a lifeguard
trying to save someone drowning in the ocean. For the lifeguard to get to the victim as
quickly as possible they will first run along the beach (low index of refraction) and then
swim to the victim (high index of refraction).

[3] We know we have reached the bottom of a valley or the top of a hill when the ground is no longer at
a slope, but is flat.
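
Fermat's principle can also be checked numerically rather than with calculus. The sketch
below is only an illustration under stated assumptions: the geometry (A one metre above the
surface, B one metre below it, two metres apart along the surface) and the indices for air
and water are made up, the function names are hypothetical, and a golden-section search
stands in for setting the derivative to zero. It finds the crossing point x that minimizes
the travel time $t = (n_0\sqrt{a^2 + x^2} + n_1\sqrt{b^2 + (d - x)^2})/c$ and then compares
$n_0 \sin\theta_0$ with $n_1 \sin\theta_1$.

```python
import math


def travel_time(x, a, b, d, n0, n1, c=299_792_458.0):
    """Travel time for a ray that crosses the surface a distance x along from A."""
    return (n0 * math.hypot(a, x) + n1 * math.hypot(b, d - x)) / c


def fastest_crossing(a, b, d, n0, n1, tol=1e-9):
    """Golden-section search for the x in [0, d] that minimizes the travel time."""
    shrink = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, d
    while hi - lo > tol:
        x1 = hi - shrink * (hi - lo)
        x2 = lo + shrink * (hi - lo)
        if travel_time(x1, a, b, d, n0, n1) < travel_time(x2, a, b, d, n0, n1):
            hi = x2   # the minimum lies to the left of x2
        else:
            lo = x1   # the minimum lies to the right of x1
    return 0.5 * (lo + hi)


# Hypothetical geometry (metres) and media (air above, water below).
A_HEIGHT, B_DEPTH, D = 1.0, 1.0, 2.0
N_AIR, N_WATER = 1.0, 1.33

x_min = fastest_crossing(A_HEIGHT, B_DEPTH, D, N_AIR, N_WATER)
sin_0 = x_min / math.hypot(A_HEIGHT, x_min)
sin_1 = (D - x_min) / math.hypot(B_DEPTH, D - x_min)

print(f"x = {x_min:.4f} m")
print(f"n0 sin(theta0) = {N_AIR * sin_0:.6f}, n1 sin(theta1) = {N_WATER * sin_1:.6f}")
```

Setting both indices equal turns the same search into the reflection problem, where the
fastest path hits the surface halfway between A and B when a = b.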
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11.2 Thin Lenses 

Now that we know how light translates, reflects, and refracts, we can apply ray dia- 
grams to thin lenses. A thin lens is a spherical lens where the thickness of the lens 
is much less than the focal length of the lens. Before the definition of focal length 
is given, let us consider the lens of Figure 11.2(a). We can draw light rays from some 
object, given by the arrow in the diagram, up to the lens as straight lines. However, 
once they pass through the lens they must bend because of refraction. 

It turns out that every light ray parallel to the optical axis, the dotted line passing
through the center of the lens, converges to a single focal point after passing through
the lens. We will not prove this now, but by using the matrix method of the following
section it can be seen that focal points must exist for spherical lenses. The focal length
of a lens is just the distance from the center of the lens to the focal point and is given
by the thin lens approximation,

$$\frac{1}{f} = \frac{n_1 - n_0}{n_0}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \tag{11.9}$$

where $R_1$ is the radius of curvature for the front of the lens, $R_2$ the radius of curvature
for the back of the lens, $n_0$ the index of refraction for the medium surrounding the lens,
and $n_1$ the index of refraction for the lens.

A ray diagram then is a geometrical method to determine how light rays will propagate 
through an optical set up. An optical system can be fully visualized by drawing the two 
following rays. 

1. A ray from the top of the object, parallel to the optical axis, is drawn to the 
centerline of the lens, and then refracted by the lens to the focal point. 

2. A ray from the top of the object is drawn through the center of the lens and passes
through unrefracted.

In Figure 11.2(a) these two rays are traced for a convex lens with the object outside 
of the focal length of the lens. 

This brings about an important point on notation. The radii of curvature and focal
lengths for lenses have signs associated with them. Oftentimes different conventions for
the signs are used between sources. For the purposes of this chapter the radius for a
sphere starts at the surface and proceeds to the center of the sphere. The focal length
begins at the center of the lens and proceeds to the focal point. Positive values are
assigned to focal lengths or radii moving from left to right, while negative values are
assigned for the opposite movement. A convex lens has $R_1 > 0$ and $R_2 < 0$ while a
concave lens has $R_1 < 0$ and $R_2 > 0$.
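
To see the sign convention at work, here is a small Python sketch. It assumes the thin lens
approximation in the form $1/f = \frac{n_1 - n_0}{n_0}\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$,
and the function name and the sample lenses (glass with index 1.5 in air, radii of 10 cm)
are made up for illustration.

```python
def thin_lens_focal_length(r1, r2, n1, n0=1.0):
    """Thin lens focal length from 1/f = (n1 - n0)/n0 * (1/R1 - 1/R2).

    r1 and r2 are signed radii: positive when the center of the sphere lies to
    the right of the surface, negative when it lies to the left.
    """
    return 1.0 / ((n1 - n0) / n0 * (1.0 / r1 - 1.0 / r2))


# Hypothetical glass lenses in air with 10 cm radii of curvature (metres).
convex = thin_lens_focal_length(r1=0.10, r2=-0.10, n1=1.5)    # R1 > 0, R2 < 0
concave = thin_lens_focal_length(r1=-0.10, r2=0.10, n1=1.5)   # R1 < 0, R2 > 0

print(f"convex lens:  f = {convex:+.3f} m")
print(f"concave lens: f = {concave:+.3f} m")
```

The convex lens comes out with a positive focal length (the focal point is to the right of
the lens) and the concave lens with a negative one, as the convention above requires.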

The diagrams of Figure 11.2 demonstrate the four basic configurations possible with
thin lenses: object outside focal length of convex lens, object inside focal length of
convex lens, object outside focal length of concave lens, and object inside focal length of
concave lens. In the diagrams, a dotted arrow indicates a virtual image while a solid
arrow indicates a real image or object. A real image is an image that can be projected
onto a surface, while a virtual image cannot. Notice the only optical configuration that
produces a real image is with an object outside the focal length of a convex lens.

Figure 11.2: Ray tracings for convex and concave thin lenses. Figures 11.2(a) and 11.2(b)
give diagrams for a convex lens with the object outside and inside the focal
length. Figures 11.2(c) and 11.2(d) give diagrams for a concave lens with
the object outside and inside the focal length as well.

11.3 The Matrix Method 

But where does the thin lens approximation of Equation 11.9 come from? How do 
we know the assumptions of the previous section are true? One method for deriving 
Equation 11.9 involves a large amount of rather tedious geometry along with the use 
of Snell's law. Unfortunately, this derivation is just more of the ray tracing diagrams 
above, and cannot be easily adapted to other optical systems, for example a system 
with a concave lens followed by a convex lens. What happens if we want a method for 
determining the focal length for any optical system? Such a method does exist and is 
called the matrix method, but as its name implies can be mathematically challenging
at times for readers unfamiliar with matrix operations. [4]

The idea behind the matrix method is to split any optical system into building blocks
of the three basic actions: translation, refraction, and reflection. Putting these three
actions together in an order dictated by the set up allows us to build any optical system.

Let us consider the example of a lens. To begin, an incident light ray is refracted
when entering the front of the lens, and so the first action is refraction. Next the ray
must pass through the lens, and so the second action is translation. Finally, when the
ray exits the lens it is refracted again, and so the third action is refraction. If we
placed a mirror some distance behind the lens, then the light would translate to the
mirror, reflect off the mirror, and then translate back to the lens followed by refraction,
translation, and refraction from the lens.

Now that we understand how to use these optical building blocks to create optical
systems, we must define the blocks more mathematically using linear algebra. [5] The idea
is to describe a single light ray as a vector and an optical building block as a matrix that
modifies that vector. Luckily, a light ray in two dimensions can be fully described by its
angle to the optical axis, $\theta$, and its distance above the optical axis, y. This means that a
light ray can be fully described by a vector of two components, $(\theta, y)$, and subsequently
each optical building block is represented by a 2 x 2 matrix.

Using Figure 11.3(a), we can determine the matrix that represents the translation
action for a light ray. First we write out a general equation,

$$\begin{pmatrix} \theta_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} T_1 & T_2 \\ T_3 & T_4 \end{pmatrix} \begin{pmatrix} \theta_0 \\ y_0 \end{pmatrix} \tag{11.10}$$

which states that the outgoing ray represented by the vector $(\theta_1, y_1)$ is equal to the
incoming light ray, $(\theta_0, y_0)$, multiplied by the translation matrix T. Performing matrix
multiplication on Equation 11.10 yields,

$$\begin{aligned}
\theta_1 &= T_1 \theta_0 + T_2 y_0 \\
y_1 &= T_3 \theta_0 + T_4 y_0
\end{aligned} \tag{11.11}$$

but by using the diagram of Figure 11.3(a), we can also write the following system of
equations.

$$\begin{aligned}
\theta_1 &= \theta_0 \\
y_1 &= x \tan\theta_0 + y_0
\end{aligned} \tag{11.12}$$

Unfortunately, Equation 11.12 is not linear; it contains the tangent of $\theta_0$! To avoid this
problem we make the assumption that all light rays passing through the optical system
are paraxial; the angle $\theta$ between the ray and the optical axis is very small and so the
small angle approximations $\sin\theta \approx \theta$, $\cos\theta \approx 1$, and $\tan\theta \approx \theta$ hold. Using the small
angle approximation on Equation 11.12 yields,

$$\begin{aligned}
\theta_1 &= \theta_0 \\
y_1 &= x \theta_0 + y_0
\end{aligned} \tag{11.13}$$

which when combined with Equation 11.11 allows us to determine the matrix elements
of T.

$$\mathbf{T} = \begin{pmatrix} 1 & 0 \\ x & 1 \end{pmatrix} \tag{11.14}$$

Figure 11.3: Diagrams of the three possible actions performed on a light ray which can
be represented as matrices: (a) translation, (b) refraction, and (c) reflection.

[4] For those who do not know how to use matrices, I suggest just quickly reading up on basic matrix
operations because the method outlined below is well worth the work to understand it. All that is
needed is a basic understanding of how to multiply 2x2 matrices along with creating 2x2 matrices
from a system of linear equations.

[5] Matrices are oftentimes used to solve linear systems of equations algebraically, hence linear algebra
refers to the branch of mathematics dealing with matrices.



With the matrix for translation, we can move on to the matrices for refraction and
reflection. The method for determining these matrices is exactly the same as for the
translation matrix, but unlike the translation matrix, these matrices are dependent upon
the geometry of the optical component being modeled for either refraction or reflection.
Spherical lenses and mirrors are the most common optical components used, and so we
will find the refraction and reflection matrices for spherical components.

The diagram of Figure 11.3(b) shows a light ray refracting through a spherical lens
with radius R. By geometry and the small angle approximation we know,

$$\begin{aligned}
\alpha_0 &= \theta_0 + \phi \\
\alpha_1 &= \theta_1 + \phi \\
\sin\phi &= \frac{y_0}{R} \;\rightarrow\; \phi \approx \frac{y_0}{R} \\
y_0 &= y_1
\end{aligned} \tag{11.15}$$

while by Snell's law (and again the small angle approximation) we know,

$$n_0 \sin\alpha_0 = n_1 \sin\alpha_1 \;\rightarrow\; n_0 \alpha_0 \approx n_1 \alpha_1 \tag{11.16}$$

where $n_0$ is the index of refraction outside the lens, and $n_1$ is the index of refraction
within the lens. Plugging in Equation 11.15 into Equation 11.16 yields,

$$\theta_1 = \frac{n_0}{n_1}\theta_0 + \frac{n_0 - n_1}{n_1 R} y_0 \tag{11.17}$$

with a little algebraic manipulation. Using the relation for $\theta_1$ above and $y_0 = y_1$, the
elements of the refraction matrix R can be found and are given below.

$$\mathbf{R} = \begin{pmatrix} \frac{n_0}{n_1} & \frac{n_0 - n_1}{n_1 R} \\ 0 & 1 \end{pmatrix} \tag{11.18}$$
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Finally, we need to determine the elements for the reflection matrix, M, of a spherical 
mirror with radius $-R$, shown in the diagram of Figure 11.3(c). First we know that
$y_0 = y_1$. Next by the law of reflection we see that the light ray must reflect with angle
$\alpha$ about the radius perpendicular to the point of reflection. By requiring the angles of
triangles to sum to $\pi$, we arrive at,

$$\begin{aligned}
\theta_0 &= \phi - \alpha \\
\theta_1 &= \phi + \alpha
\end{aligned} \tag{11.19}$$

into which $\phi \approx y_0/R$ (using the small angle approximation and geometry) can be
substituted. This results in,

$$\theta_1 = -\theta_0 + \frac{2}{R} y_0 \tag{11.20}$$

which, along with $y_1 = y_0$ and a little manipulation, can again be used to determine the
elements of the reflection matrix given below in Equation 11.21.

$$\mathbf{M} = \begin{pmatrix} -1 & \frac{2}{R} \\ 0 & 1 \end{pmatrix} \tag{11.21}$$



11.4 Lensmaker's Equation 

Finding the three matrices for translation, refraction, and reflection above is rather
involved, and the advantage of using the matrix method may not be readily apparent,
so let us apply it to a thick lens, as shown in Figure 11.4. However, before we can dive
into the matrix method, we must define the cardinal points of the thick lens. The
focal points $F_0$ and $F_1$ are defined as points through which light rays parallel to the
optical axis will pass after refracting through the lens. The principal points $P_0$ and
$P_1$ are the points where the parallel rays intersect the rays from the focal points when
neglecting refraction. Finally, the nodal points $N_0$ and $N_1$ are the points through
which a ray enters the lens, refracts, and exits on a parallel trajectory. The nodal points
are equivalent to the center of a thin lens, and are not shown in Figure 11.4 because
they are not relevant for the discussion below.

Figure 11.4: Diagram of a thick lens with the cardinal points, excluding nodal points $N_0$
and $N_1$. Here the medium surrounding the lens has index of refraction $n_0$
while the lens has index of refraction $n_1$.

Consider the light ray leaving the focal point $F_0$ and exiting the thick lens parallel to
the optical axis. The incoming ray has some initial angle, $\theta_0$, and some initial height,
$y_0$. After exiting the lens, the ray has some final angle $\theta_1$ and some final height $y_1$. For
this specific scenario we know the final angle must be 0 by the definition of the focal
point. Just like with Equation 11.10 we can write,

$$\begin{pmatrix} \theta_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} S_1 & S_2 \\ S_3 & S_4 \end{pmatrix} \begin{pmatrix} \theta_0 \\ y_0 \end{pmatrix} \tag{11.22}$$

where S is the transforming matrix for the optical system, or in this case, the thick lens.

We don't know what S is, but by looking at the diagram and using the translation,
refraction, and reflection matrices, we can build S. First, the ray refracts, and so we
must multiply the initial ray vector by the refraction matrix. Next, the ray is translated
over a distance x and so we must multiply the initial ray vector and refraction matrix
with a translation matrix. Finally, the ray refracts again, and so we must multiply the
previous matrices and ray vector with another refraction matrix.

$$\mathbf{S} = \mathbf{R}\,\mathbf{T}\,\mathbf{R} \tag{11.23}$$

It is important to note that the order matters. In Equation 11.23 the R farthest to the
right represents the first refraction, while the R on the left is the second refraction.

Now, we can use Equations 11.14 and 11.18 for T and R to explicitly calculate out S.
The first refraction takes the ray from $n_0$ into $n_1$ at a surface of radius $R_1$, while the
second takes it from $n_1$ back into $n_0$ at a surface of radius $R_2$.

$$\mathbf{S} =
\begin{pmatrix} \frac{n_1}{n_0} & \frac{n_1 - n_0}{n_0 R_2} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ x & 1 \end{pmatrix}
\begin{pmatrix} \frac{n_0}{n_1} & \frac{n_0 - n_1}{n_1 R_1} \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix}
1 + \frac{(n_1 - n_0) x}{n_1 R_2} &
\frac{n_1 - n_0}{n_0}\left(\frac{1}{R_2} - \frac{1}{R_1}\right) - \frac{(n_1 - n_0)^2 x}{n_0 n_1 R_1 R_2} \\
\frac{n_0 x}{n_1} &
1 - \frac{(n_1 - n_0) x}{n_1 R_1}
\end{pmatrix} \tag{11.24}$$

Returning to Equation 11.22 we can write out relations for $\theta_1$ and $y_1$,

$$\begin{aligned}
\theta_1 &= S_1 \theta_0 + S_2 y_0 = 0 \;\rightarrow\; y_0 = -\frac{S_1 \theta_0}{S_2} \\
y_1 &= S_3 \theta_0 + S_4 y_0
\end{aligned} \tag{11.25}$$

where in the first relation we have utilized the fact that the outgoing ray is parallel to
the optical axis, i.e. $\theta_1 = 0$. Looking back at Figure 11.4 we see from the geometry that,

$$f_0 = -\frac{y_1}{\tan\theta_0} \approx -\frac{y_1}{\theta_0} \tag{11.26}$$

where the small angle approximation was used in the second step. Plugging in Equation
11.25 into Equation 11.26 yields,

$$f_0 = -\frac{S_3 \theta_0 + S_4 y_0}{\theta_0} = -S_3 + \frac{S_1 S_4}{S_2} = \frac{S_1 S_4 - S_3 S_2}{S_2} = \frac{1}{S_2} \tag{11.27}$$

which gives the focal point $f_0$, in terms of the elements of S. The algebra in the second
to last step is found by plugging in values for the elements of S in the numerator using
Equation 11.24 and is rather tedious, but the result $S_1 S_4 - S_3 S_2 = 1$ is well worth it.

If we take the final result of Equation 11.27 and plug in $S_2$ from Equation 11.24, we
obtain,

$$\frac{1}{f_0} = \frac{n_1 - n_0}{n_0}\left(\frac{1}{R_2} - \frac{1}{R_1}\right) - \frac{(n_1 - n_0)^2 x}{n_0 n_1 R_1 R_2} \tag{11.28}$$

which is known as the lensmaker's equation. As x becomes very small, the lens
approaches the thin lens approximation and Equation 11.28 becomes,

$$\frac{1}{f_0} = \frac{n_1 - n_0}{n_0}\left(\frac{1}{R_2} - \frac{1}{R_1}\right) \tag{11.29}$$

which, apart from an overall sign, is the thin lens approximation for focal length given in
Equation 11.9! The sign differs only because $f_0$ is the front focal length, which points
from the lens to the left and is therefore negative in our sign convention.
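
The matrix bookkeeping is easy to hand over to a computer. The Python sketch below is an
illustration only, written under a few assumptions: the ray vector is ordered as
$(\theta, y)$, the translation matrix is ((1, 0), (x, 1)), and the refraction matrix for a
ray passing from index n_in into index n_out at a surface of signed radius R is
((n_in/n_out, (n_in - n_out)/(n_out R)), (0, 1)), which is what the paraxial relations of
the previous section give. The function names and the sample lens (glass of index 1.5 in
air, radii of +10 cm and -10 cm, 1 cm thick) are made up.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def translation(x):
    """Translation over a distance x along the optical axis."""
    return ((1.0, 0.0), (x, 1.0))


def refraction(n_in, n_out, radius):
    """Paraxial refraction at a spherical surface of signed radius `radius`."""
    return ((n_in / n_out, (n_in - n_out) / (n_out * radius)), (0.0, 1.0))


def thick_lens(n0, n1, r1, r2, x):
    """System matrix S = R T R; the rightmost matrix acts on the ray first."""
    first = refraction(n0, n1, r1)    # entering the lens
    second = refraction(n1, n0, r2)   # leaving the lens
    return matmul(second, matmul(translation(x), first))


# Hypothetical glass lens in air, all lengths in metres.
N0, N1, R1, R2 = 1.0, 1.5, 0.10, -0.10

thick = thick_lens(N0, N1, R1, R2, x=0.01)
thin = thick_lens(N0, N1, R1, R2, x=0.0)

det = thick[0][0] * thick[1][1] - thick[0][1] * thick[1][0]
f_thick = 1.0 / thick[0][1]   # f_0 = 1 / S_2
f_thin = 1.0 / thin[0][1]

print(f"det S = {det:.6f}")
print(f"thick lens f_0 = {f_thick:.5f} m, thin lens f_0 = {f_thin:.5f} m")
```

Adding a second lens or a mirror to the system is then just a matter of multiplying in more
matrices on the left, which is exactly why the method is worth the initial effort.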

11.5 Fresnel Equations 

The previous sections have exclusively dealt with geometrical optics, which while im- 
portant, cannot describe all light phenomena, specifically polarization. To understand 
polarization we must leave the realm of geometrical optics and enter the world of phys- 
ical optics. The first thing we must do is understand the details of what exactly light 
is. Light can be thought of as the combination of two fields, an electric field oscillating 
up and down, and a magnetic field, perpendicular to the electric field, also oscillating. 
The cross product of the electric field with the magnetic field, E x B, is always in the 
direction that the light wave is traveling. 

If an observer were to see a light wave directly approaching them, they would see 
the electric field as a vector, always pointing in the same direction, but growing and 
shrinking. Similarly, they would see another vector representing the magnetic field, 
perpendicular to the electric field vector, also growing and shrinking as the light wave 
approached. The vectors that the observer sees are two dimensional, and so like any two 
dimensional vector, they can be broken into x and y components. 

So what happens if the observer is in jail and the light must pass through parallel 
vertical jail bars that block electric fields? Any bit of the electric field that is not 
vertical will bounce off the bars, while any part of the electric field that is vertical will 
pass through. In other words, the x component of the electric field will not survive, but 
the y component will. Additionally, if the electric field is blocked, then so is the
associated magnetic field, and so the y component of the magnetic field associated with 
the x component of the electric field will be blocked as well. Now the observer sees the 
electric field vector oscillating vertically up and down, while the magnetic field vector is 
oscillating left and right. 

The thought experiment above is the general idea behind plane polarization. Incom- 
ing parallel light waves have a myriad of different electric and magnetic field directions, 
but after passing through a vertical polarizer, only the vertical components of the elec- 
tric field, and the horizontal components of the magnetic field survive. Another type 
of polarization, circular polarization is also possible and is the same idea as plane 
polarized light, but is more difficult to visualize. 

Now what happens if we consider light bouncing off a piece of glass? The light incident
on the glass can be either vertically or horizontally polarized, again because the vectors
can be broken into their x and y components. The composite of the two polarizations
is just normal unpolarized light, but by looking at the two components individually we
can see what happens to the light reflected off the glass. In Figure 11.5(a) the light is
polarized so that the E field is coming out of the page and perpendicular to the plane of
incidence. [6] The light wave is traveling in the direction E x B and so we can then draw
the direction of the magnetic field.

When the light wave hits the glass surface, part of the wave is reflected, while part
of the wave passes through the glass. Because the electric field is parallel to the surface
of the glass, the electric field will remain pointing in the same direction as the incident
light wave for both the reflected and transmitted waves. This means that,

$$E_{i,\perp} + E_{r,\perp} = E_{t,\perp} \tag{11.30}$$

or that the sum of the incident and reflected electric field amplitudes must equal the
transmitted electric field amplitude.

Next, we see that the horizontal components of the magnetic fields in the incident and
transmitted waves are in the same direction, while the horizontal component of
the magnetic field in the reflected wave has been flipped by the reflection. This gives us,

$$B_{i,\parallel} \cos\theta_i - B_{r,\parallel} \cos\theta_r = B_{t,\parallel} \cos\theta_t \tag{11.31}$$

which states that the incident horizontal component of the magnetic field less the
reflected field must equal the transmitted magnetic field. Using the relation that
$E = \frac{c}{n} B$, Equation 11.31 can be rewritten in terms of the electric field,

$$n_0 E_{i,\perp} \cos\theta_i - n_0 E_{r,\perp} \cos\theta_r = n_1 E_{t,\perp} \cos\theta_t \tag{11.32}$$

where $n_0$ is the index of refraction for the incident and reflected medium, and $n_1$ the
index of refraction for the transmitted medium. Note that the speed of light cancels out
of the equation because both sides are divided by c. We now have three unknowns and
only two equations. This is not a problem however, because we only want to know the
percentage of light reflected back, and so we can eliminate $E_{t,\perp}$, let $\theta_r = \theta_i$ by the law
of reflection, and solve for the ratio of the reflected electric field to the incident electric
field.

$$\frac{E_{r,\perp}}{E_{i,\perp}} = \frac{\cos\theta_i - \frac{n_1}{n_0}\cos\theta_t}{\cos\theta_i + \frac{n_1}{n_0}\cos\theta_t} \tag{11.33}$$

[6] This notation can be a bit confusing but is the standard. Perpendicular to the plane of incidence
means that the electric field is parallel with the surface of the reflector if the reflector is a plane.



What happens if we look at the light with electric fields parallel to the plane of
incidence rather than perpendicular, as is shown in Figure 11.5(b)? [7] We can repeat the
exact same process except now all the magnetic fields stay the same direction,

$$B_{i,\perp} + B_{r,\perp} = B_{t,\perp} \tag{11.34}$$

and the horizontal component of the electric field is flipped by reflection.

$$E_{i,\parallel} \cos\theta_i - E_{r,\parallel} \cos\theta_r = E_{t,\parallel} \cos\theta_t \tag{11.35}$$

The substitution for $E = \frac{c}{n} B$ can be made again and we can obtain the ratio of the
reflected electric field to the incident electric field.

$$\frac{E_{r,\parallel}}{E_{i,\parallel}} = \frac{\cos\theta_i - \frac{n_0}{n_1}\cos\theta_t}{\cos\theta_i + \frac{n_0}{n_1}\cos\theta_t} \tag{11.36}$$

Knowing the ratio of reflected to incident electric fields is nice, but it does not tell
us anything that we can easily measure. However, the intensity of light is proportional to
the square of the electric field, so if we square Equations 11.33 and 11.36 we can find the
percentage of light reflected for an incidence angle $\theta_i$ and transmitted angle $\theta_t$ for both
perpendicular and parallel polarized light! Taking this one step further, we can eliminate
$\theta_t$ with Snell's law, $n_0 \sin\theta_i = n_1 \sin\theta_t$. Finally, trigonometric properties can be used to
reduce the equations into an even more compact form. These equations are known as
the Fresnel equations. [8]

[7] Again, to clear up the confusion, this means the electric field is now perpendicular to the surface of
the reflector.

[8] Pronounced fray-nell, the s is silent.

Figure 11.5: Physical interpretation of the Fresnel equations given in Equations 11.37
and 11.38 for light transitioning from glass to air and air to glass.

So where does all of this math get us? Figure 11.5 plots the percentage of light reflected
for perpendicular and parallel polarized light with respect to the plane of incidence
for the scenario of light passing from air to glass, $n_0 = 1$, $n_1 = 1.5$, and glass to air,
$n_0 = 1.5$, $n_1 = 1$. For the scenario of light passing from air to glass, there is a point on
the plot at $\theta_i = 0.98$ rad where no parallel polarized light, only perpendicular polarized light,
is reflected. This angle is called Brewster's angle and is the angle at which the light
reflecting off a surface is completely polarized! For the second scenario, light passing
from glass to air, the light is completely reflected for values greater than $\theta_i = 0.73$ rad. This
is called total internal reflection. By taking into account the polarization of light we
are able to see two very real light phenomena!
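
The two special angles quoted above can be reproduced with a short Python sketch. It is
only an illustration: it squares the amplitude ratios for perpendicular and parallel
polarized light, $(\cos\theta_i - \frac{n_1}{n_0}\cos\theta_t)/(\cos\theta_i + \frac{n_1}{n_0}\cos\theta_t)$
and $(\cos\theta_i - \frac{n_0}{n_1}\cos\theta_t)/(\cos\theta_i + \frac{n_0}{n_1}\cos\theta_t)$,
with $\theta_t$ taken from Snell's law. The closed forms $\tan\theta_B = n_1/n_0$ for
Brewster's angle and $\sin\theta_c = n_1/n_0$ for the onset of total internal reflection are
standard results that are not derived in this chapter, and the function name is
hypothetical.

```python
import math


def reflectance(theta_i, n0, n1):
    """Fraction of intensity reflected for (perpendicular, parallel) polarization."""
    sin_t = n0 * math.sin(theta_i) / n1        # Snell's law
    if sin_t >= 1.0:                           # no transmitted ray exists
        return 1.0, 1.0                        # total internal reflection
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    r_perp = (cos_i - (n1 / n0) * cos_t) / (cos_i + (n1 / n0) * cos_t)
    r_par = (cos_i - (n0 / n1) * cos_t) / (cos_i + (n0 / n1) * cos_t)
    return r_perp ** 2, r_par ** 2


# Air to glass: the parallel component vanishes at Brewster's angle.
brewster = math.atan(1.5 / 1.0)
perp_at_brewster, par_at_brewster = reflectance(brewster, 1.0, 1.5)

# Glass to air: everything is reflected beyond the critical angle.
critical = math.asin(1.0 / 1.5)
beyond_critical = reflectance(critical + 0.01, 1.5, 1.0)

print(f"Brewster's angle = {brewster:.2f} rad, parallel reflectance = {par_at_brewster:.1e}")
print(f"critical angle   = {critical:.2f} rad, reflectance beyond it = {beyond_critical}")
```

Evaluating reflectance over a range of incidence angles for both cases gives curves that
can be compared directly with the plots of Figure 11.5.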

11.6 Experiment 

The experiment for this chapter explores both geometrical and physical optics and
consists of three parts. In the first part of the experiment, the focal length of a convex lens
is measured by using the relation,

$$\frac{1}{x_o} + \frac{1}{x_i} = \frac{1}{f} \tag{11.39}$$

where $x_i$ is the distance of the image from the lens and $x_o$ is the distance of the object
from the lens. This relationship can be derived by tracing an additional ray in Figure
11.2(a) and equating ratios from the similar triangles formed.

The second part of the experiment measures the focal length of a concave lens. Because
concave lenses do not project real images, a concave lens must be placed before the convex
lens. The focal length for the system is just,

$$\frac{1}{f} = \frac{1}{f_{\text{concave}}} + \frac{1}{f_{\text{convex}}} \tag{11.40}$$

where $f_{\text{concave}}$ and $f_{\text{convex}}$ are the focal lengths for the concave and convex lenses. The
focal length for the compound system can be found by using Equation 11.39, and so after
plugging in the focal length of the convex lens from the first part of the experiment, the
focal length of the concave lens can be found.
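
The arithmetic for the first two parts can be sketched as follows. The distances are
invented purely for illustration and the function names are hypothetical; the sketch
assumes the relations $1/x_o + 1/x_i = 1/f$ and
$1/f = 1/f_{\text{concave}} + 1/f_{\text{convex}}$.

```python
def focal_length(x_object, x_image):
    """Focal length from the object and image distances, 1/x_o + 1/x_i = 1/f."""
    return 1.0 / (1.0 / x_object + 1.0 / x_image)


def concave_focal_length(f_system, f_convex):
    """Solve 1/f = 1/f_concave + 1/f_convex for the concave lens."""
    return 1.0 / (1.0 / f_system - 1.0 / f_convex)


# Part one: convex lens alone (hypothetical distances in metres).
f_convex = focal_length(x_object=0.30, x_image=0.15)

# Part two: concave lens placed before the convex lens (hypothetical distances).
f_system = focal_length(x_object=0.40, x_image=0.40)
f_concave = concave_focal_length(f_system, f_convex)

print(f"f_convex = {f_convex:.3f} m, f_system = {f_system:.3f} m, f_concave = {f_concave:.3f} m")
```

The concave focal length comes out negative, as it should for a diverging lens, and the
compound system is weaker (longer focal length) than the convex lens on its own.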

The final part of the experiment verifies the Fresnel equations by producing a plot 
very similar to Figure 11.5. A laser is polarized using a filter and bounced off a glass 
plate. The reflected laser is directed to a photo-multiplier tube which creates a voltage 
proportional to the intensity of the light. Brewster's angle can then be determined 
from the plots made. It is important in this part of the experiment not to mix up the
polarizations of the light. The line on the polarizing filter indicates the direction in which the
electric field is polarized, so for light polarized perpendicular to the plane
of incidence the line should be vertical.
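
The two special angles discussed in this chapter can be estimated with a few lines of Python. This is only a sketch: it assumes the standard results $\tan\theta_B = n_2/n_1$ for Brewster's angle and $\sin\theta_c = n_2/n_1$ for the critical angle, and it assumes refractive indices of 1.0 for air and 1.5 for glass, which may differ from the glass plate used in the laboratory.

```python
import math


def brewster_angle(n1, n2):
    """Brewster's angle (radians) for light travelling from index n1 into index n2."""
    return math.atan2(n2, n1)


def critical_angle(n1, n2):
    """Critical angle (radians) for total internal reflection, which requires n1 > n2."""
    if n1 <= n2:
        raise ValueError("total internal reflection requires n1 > n2")
    return math.asin(n2 / n1)


# Assumed refractive indices (not taken from the text).
n_air = 1.0
n_glass = 1.5

theta_b = brewster_angle(n_air, n_glass)    # air to glass
theta_c = critical_angle(n_glass, n_air)    # glass to air

print(f"Brewster's angle = {theta_b:.3f} rad = {math.degrees(theta_b):.1f} degrees")
print(f"critical angle   = {theta_c:.3f} rad = {math.degrees(theta_c):.1f} degrees")
```

With the assumed indices the critical angle is about 0.73 radians, which agrees with the value quoted for total internal reflection.
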
At the beginning of the 19th century a great debate had been raging within the physics
community for over 100 years, sparked by the diametrically opposing theories of Isaac 
Newton and Christiaan Huygens regarding the nature of light. Newton, in his book 
Opticks¹, outlined a theory where light was made up of small particles or corpuscles.
Geometrical optics, as introduced in Chapter 11, is modeled well by rays or straight lines, 
which can be thought of as the paths traced out by individual light particles, and so 
Newton's theory described the optics of the day well. Huygens, however, in his Treatise 
on Light, written in 1690, 14 years earlier than Newton's Opticks, proposed that light was
not made up of particles, but rather waves, similar to ocean waves. This theory had its 
own merits, but was generally ignored in favor of Newton's particle theory. 

The debate between the theories of Huygens and Newton came to its first² resolution
in 1802 when Thomas Young performed his famous double slit experiment which em- 
phatically demonstrated the wave nature of light. In the experiment, Young set up a 
light source which passed through a cover with two slits. The light then passed from 
these two slits onto a screen, where the pattern of the light could be observed. According
to Newton's theory, the light should simply project images of the two slits onto the screen. What Young
found, however, was a complicated interference pattern.

12.1 Interference 

Before diving into the details of Young's experiment, we first need to understand exactly 
what is meant by an interference pattern. Looking back at both Chapters 10 and 11 
we know that a light wave consists of an electric field perpendicular to a magnetic field. 
Because the magnetic field is directly related to the electric field (and vice versa), to 
fully describe a light wave it is only necessary to specify either the electric or magnetic 
field. By convention, the electric field is typically used to describe the wave, and is given 
by the general wave function (in one dimension), 

$$E = E_0 \sin(kx - \omega t) \qquad (12.1)$$

where $x$ is the position at which the wave is measured, $t$ is the time at which the wave
is measured, $E_0$ is the amplitude of the wave, $\omega$ is the angular frequency of the wave³,
and $k$ is the wave number, defined as,

$$k = \frac{2\pi}{\lambda} \qquad (12.2)$$

where $\lambda$ is the wavelength.

¹ He based this publication on his first series of lectures at Trinity College, Cambridge in 1704.

² Why this is just the first resolution, and not the final resolution, of the debate will be explained in the
double slit diffraction section of this chapter.

³ See Chapter 10 for more detail; the angular frequency is related to the frequency of a wave by
$\omega = 2\pi f$.

So what happens if we place two light waves on top of each other? In electrodynamics, 
electric fields can be combined by superposition, which just means that the electric 
fields are added together. If the electric field of Equation 12.1 was added to another
electric field described by the exact same equation (same $k$ and $\omega$), the result would
be Equation 12.1 but now with an amplitude of $2E_0$ rather than $E_0$. This is called
complete constructive interference, where two fields are added together with the
same phase.

But what is the phase of a wave? We can rewrite Equation 12.1 as, 

$$E = E_0 \sin(kx - \omega t + \delta) \qquad (12.3)$$

where $\delta$ is the phase of the wave. From the equation, we see that the phase of the wave
just shifts the wave to the right by $\delta$ if the phase is negative, and to the left by $\delta$ if
the phase is positive. Now, if we add two waves together, one with a field described by
Equation 12.3 with $\delta = 0$, and another with $\delta = \pi$, the peaks of the first wave match
with the valleys of the second wave, and so when the two waves are added together,
the net result is zero! This is called complete destructive interference. There are
of course combinations between destructive and constructive interference, as shown in
Figure 12.1(a) where the wave $\sin(x)$ is added onto the wave $\sin(x + \pi/2)$, but in general,
noticeable interference is either complete constructive or destructive.

Now let us take a slightly more mathematical approach to the idea of interference 
between two waves, 

$$E_1 = E_0 \sin\left(kx - \omega t + \frac{\delta}{2}\right), \quad E_2 = E_0 \sin\left(kx - \omega t - \frac{\delta}{2}\right) \qquad (12.4)$$

where the wavenumber and angular frequencies are the same but with a phase difference
between the two waves of $\delta/2 - (-\delta/2) = \delta$. We can again just add the two waves
together to find the total electric field wave, but physically this is not very interesting, 
as the human eye cannot observe the pattern of an electric wave. However, what is of 
interest is the intensity (or brightness) of the wave, which is proportional to the electric 
field squared. Adding the two electric fields of Equation 12.4 together and squaring then 
gives us a quantity proportional to the intensity. 

$$\begin{aligned}
I \propto E^2 &= (E_1 + E_2)^2 \\
&= E_1^2 + E_2^2 + 2E_1E_2 \\
&= E_0^2 \sin^2\left(kx - \omega t + \frac{\delta}{2}\right) + E_0^2 \sin^2\left(kx - \omega t - \frac{\delta}{2}\right) \\
&\quad + 2E_0^2 \sin\left(kx - \omega t + \frac{\delta}{2}\right) \sin\left(kx - \omega t - \frac{\delta}{2}\right)
\end{aligned} \qquad (12.5)$$

Figure 12.1: Diagram of interference between two waves is given in Figure 12.1(a) and
Huygens' principle is demonstrated in Figure 12.1(b).

The final expanded step of the intensity above looks rather nasty, but luckily we can
simplify it. We note that it is impossible to look everywhere (i.e. at all $x$) at the same
time (i.e. for a specific $t$). Humans can, however, look at a specific point over a period
of time. Consequently, we want to look at the final step of Equation 12.5 at a specific
point over a period of time. This means $x$ becomes a constant, and we need to find the
time average of the trigonometric functions. Without giving a rigorous mathematical
argument, the time average, indicated by angled brackets, for $\sin^2$ and $\cos^2$ is just 1/2
and for sine and cosine is just 0.

$$\langle \sin^2 t \rangle = \frac{1}{2}, \quad \langle \sin t \rangle = 0, \quad \langle \cos^2 t \rangle = \frac{1}{2}, \quad \langle \cos t \rangle = 0 \qquad (12.6)$$

Using the time average of the trigonometric functions, the first two terms in the final
step of Equation 12.5 each become $E_0^2/2$.

$$E^2 = \frac{1}{2}E_0^2 + \frac{1}{2}E_0^2 + 2E_0^2 \sin\left(kx - \omega t + \frac{\delta}{2}\right) \sin\left(kx - \omega t - \frac{\delta}{2}\right) \qquad (12.7)$$

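By using the trigonometric identity,

$$\sin\theta \sin\phi = \frac{\cos(\theta - \phi) - \cos(\theta + \phi)}{2} \qquad (12.8)$$

the final term can be recast in terms of cosines and the terms dependent upon $t$ can be
time averaged.

$$\begin{aligned}
2E_0^2 \sin\left(kx - \omega t + \frac{\delta}{2}\right) \sin\left(kx - \omega t - \frac{\delta}{2}\right) &= E_0^2 \cos\delta - E_0^2 \cos(2kx - 2\omega t) \\
&\rightarrow E_0^2 \cos\delta
\end{aligned} \qquad (12.9)$$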

Combining this result for the final term with the first two terms of Equation 12.7 results
in,

$$E^2 = \frac{1}{2}E_0^2 + \frac{1}{2}E_0^2 + E_0^2 \cos\delta = E_0^2 (1 + \cos\delta) \qquad (12.10)$$

which can be further simplified using the trigonometric identity,

$$1 + \cos\delta = 2\cos^2\left(\frac{\delta}{2}\right) \qquad (12.11)$$

for the cosine function.

$$E^2 = 2E_0^2 \cos^2\left(\frac{\delta}{2}\right) \qquad (12.12)$$

Putting this all back together, we have found the time averaged intensity at any point
of two overlapping light waves of the same wavelength, but with a phase separation of
$\delta$.

$$I \propto 2E_0^2 \cos^2\left(\frac{\delta}{2}\right) \qquad (12.13)$$

We can now check that this mathematical description for the interference between two
waves makes sense. If the two waves are completely in phase, or $\delta = 0$, we should have
complete constructive interference and Equation 12.13 should be at a maximum, which
we see it is. Similarly, Equation 12.13 is at a minimum of 0 when $\delta = \pi$, or the waves
have complete destructive interference.
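
For readers who would rather check the time average numerically than trust the trigonometric identities, the sketch below averages $(E_1 + E_2)^2$ over one full period at a fixed position and compares it with $2E_0^2\cos^2(\delta/2)$. The function names, the sample count, and the chosen value of $kx$ are all assumptions made for illustration.

```python
import math


def time_averaged_square(delta, e0=1.0, kx=0.3, samples=10_000):
    """Average (E1 + E2)^2 over one full period of omega * t at a fixed position."""
    total = 0.0
    for n in range(samples):
        wt = 2.0 * math.pi * n / samples
        e1 = e0 * math.sin(kx - wt + delta / 2.0)
        e2 = e0 * math.sin(kx - wt - delta / 2.0)
        total += (e1 + e2) ** 2
    return total / samples


def predicted(delta, e0=1.0):
    """The closed form derived above: 2 E0^2 cos^2(delta / 2)."""
    return 2.0 * e0**2 * math.cos(delta / 2.0) ** 2


for delta in (0.0, math.pi / 2.0, math.pi):
    numeric = time_averaged_square(delta)
    print(f"delta = {delta:.3f}: numeric = {numeric:.6f}, formula = {predicted(delta):.6f}")
```

The two columns should agree for any phase difference, including the fully constructive and fully destructive cases.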

12.2 Double Slit Interference 

So how do we create an experiment to check Equation 12.13? The first item needed 
is a light source that produces waves that have the same phase and wavelength or a 
coherent and monochromatic light source. Next we need to combine two waves from 
the light source with a known phase difference between the two, and observe the intensity 
pattern. This can be accomplished by taking advantage of what is known as Huygens'
principle, illustrated in Figure 12.1(b). Here, plane waves of wavelength $\lambda$ are incident
on some surface with a slit. The waves pass through the slit and propagate out in a
circular manner but still with wavelength $\lambda$.

Using this principle, a setup can be made with a coherent monochromatic light source 
emitting plane waves which pass through two slits separated by a distance d, shown in 
Figure 12.2(a). The plane waves then become spherical and propagate outwards from 
the slit, interfering with the wave from the adjacent slit. A screen is placed at a very far 
distance $L$ from the double slit, such that $L \gg d$. This setup is called a Fraunhofer
or far-field interference experiment because L is so large in comparison to the distance 
between the slits. 

Consider now looking at a point $P$ on the projection screen at a distance $y$ from
the centerline between the two slits, forming an angle $\theta$ with the centerline. We can
trace a line along the spherical waves emitted from each slit to this point, and label
the distance of the path from the upper slit as $r_1$ and the lower slit as $r_2$. Because the
screen is so far away, we can approximate $r_1$ and $r_2$ as being parallel, as shown in Figure
12.2(a). Drawing a line perpendicular to $r_1$ from the upper slit to $r_2$ creates a small
right triangle, of which we label the base length as $\Delta r$. The upper angle of this triangle
is $\theta$ by geometry, and so $\Delta r$ is just $d\sin\theta$. Because $r_1$ and $r_2$ are approximately parallel,
we can write $r_2$ as $r_2 = r_1 + \Delta r$.

The waves emitted from the slits are spherical, but we can see that the profile of the
wave as it travels along either $r_1$ or $r_2$ is described by the general wave equation of
Equation 12.1, with $r$ replacing $x$ as the position variable. We can then write the equation of
both electric fields, $E_1$ and $E_2$, at the point where they overlap on the projection screen.

$$\begin{aligned}
E_1 &= E_0 \sin(kr_1 - \omega t) \\
E_2 &= E_0 \sin(kr_2 - \omega t) \\
&= E_0 \sin(k(r_1 + \Delta r) - \omega t) \\
&= E_0 \sin\left(kr_1 - \omega t + \frac{2\pi}{\lambda} d\sin\theta\right)
\end{aligned} \qquad (12.14)$$

In the second to last step, $r_1 + \Delta r$ is substituted for $r_2$, and in the final step $d\sin\theta$ is
substituted for $\Delta r$ and $2\pi/\lambda$ for $k$.

From the first and final line of Equation 12.14, we see that the phase difference between
the two waves at point $P$ on the projection screen is just $k\Delta r$.

$$\delta = \frac{2\pi d \sin\theta}{\lambda} \qquad (12.15)$$

If we assume that the angle $\theta$ is small, we can then make the small angle approximation
between sine and tangent,

$$\sin\theta \approx \tan\theta = \frac{y}{L} \qquad (12.16)$$

and plug this back into the phase difference of Equation 12.15 for $\sin\theta$.

$$\delta = \frac{2\pi d y}{\lambda L} \qquad (12.17)$$

We now have the phase difference between the two different waves at point $P$ in terms
of $y$, the distance from the centerline of the setup, $L$, the distance of the projection
screen from the slits, $d$, the distance between the two slits, and $\lambda$, the wavelength of the
monochromatic coherent light being used.

Because we have the phase difference between the two waves, and the two waves have 
the same wavelength and frequency, we can use Equation 12.13 to determine the intensity 
pattern on the screen as a function of the distance y. 

$$I(y) \propto 2E_0^2 \cos^2\left(\frac{\pi d y}{\lambda L}\right) \qquad (12.18)$$

As this is just a proportionality, we can absorb all the constant coefficients and replace
them with some maximal intensity, $I_0$, to make an equality,

$$I(y) = I_0 \cos^2\left(\frac{\pi d y}{\lambda L}\right) \qquad (12.19)$$

which is plotted in Figure 12.2(b). From this plot we can see that the intensity pattern
will be at a maximum when,

$$y_\text{max} = \frac{m\lambda L}{d}, \quad m = 0, 1, 2, \ldots \qquad (12.20)$$

where $m$ is the order of the maxima found.
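
To get a feel for the sizes involved, the following sketch evaluates Equations 12.19 and 12.20 for a hypothetical setup. The wavelength, slit separation, and screen distance are assumed values picked only for illustration, and the function names are my own.

```python
import math


def double_slit_intensity(y, wavelength, d, L, i0=1.0):
    """Equation 12.19: I(y) = I0 cos^2(pi d y / (lambda L))."""
    return i0 * math.cos(math.pi * d * y / (wavelength * L)) ** 2


def maximum_position(m, wavelength, d, L):
    """Equation 12.20: position of the order-m maximum."""
    return m * wavelength * L / d


# Hypothetical setup (assumed values, not from the text).
wavelength = 633e-9   # red laser light, metres
d = 0.25e-3           # slit separation, metres
L = 2.0               # distance to the screen, metres

for m in range(4):
    y = maximum_position(m, wavelength, d, L)
    ratio = double_slit_intensity(y, wavelength, d, L)
    print(f"m = {m}: y = {1e3 * y:.3f} mm, I/I0 = {ratio:.3f}")
```

For these numbers the bright fringes are a few millimetres apart, so $y \ll L$ and the small angle approximation of Equation 12.16 is well justified.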

The intensity pattern of Figure 12.2(b) is an incredible result that allows us to see 
whether light is a wave or particle using a simple experimental setup. However, it turns 
out that in the derivation above we have assumed that a single spherical wave is emitted 
from each slit, which is not a good approximation unless the width of the slits is much 
smaller than the separation between the slits. We will now take this into account, first 
with single slit diffraction and then with double slit diffraction.

Figure 12.2: Diagram of the double slit experiment is given in Figure 12.2(a) and the
intensity pattern is given in Figure 12.2(b).

12.3 Single Slit Diffraction

While reading through this chapter, one may have noticed that the word interference 
has been used to describe the previous two sections. So what then is diffraction? The 
definition for diffraction can be rather tricky, but in general, diffraction is a phenomenon
that occurs from the interference of a continuous set of waves, rather than a discrete 
number of waves, such as two in the double slit interference example above. In the 
following example of a single slit, we must now consider an infinite number of light 
waves interfering, rather than a set of two waves. 

Figure 12.3(a) shows a single slit experiment setup where a coherent monochromatic 
light source passes through a slit of width a and is projected onto a screen at a distance 
$L$ from the slit. Again, we assume that $L \gg a$, or Fraunhofer diffraction, and that
we observe the interference pattern at point $P$ a distance $y$ above the centerline of the
slit. In this experimental setup we must consider adding together an infinite number of
electric fields, $E_1 + E_2 + E_3 + \cdots + E_\infty$, each with a slightly different path length $r$, and
squaring the result to determine the intensity of the light on the projection screen. Of 
course, adding together an infinite number of electric fields by hand is not fun, and so 
instead we will use an integral. 

First, we notice that Figure 12.3(a) is very similar to Figure 12.2(a); the only differ- 
ence, as mentioned before, is that we now must consider a continuous source of electric 
fields rather than two discrete electric fields. We can write the first infinitesimally small 
contribution to the total electric field as, 

$$dE = \frac{dE_0}{r_0} \sin(kr_0 - \omega t) \qquad (12.21)$$

where $r_0$ is the distance from the top of the slit to point $P$. The value $dE_0$ is the
infinitesimal amplitude of the electric field.

We can move down the slit a distance x and consider an infinitesimal contribution 
being emitted from this point. The path distance, just as in the double slit setup, is now 
increased by an amount $\Delta r$ from $r_0$.

$$dE = \frac{dE_0}{r_0 + \Delta r} \sin(k(r_0 + \Delta r) - \omega t) \qquad (12.22)$$

Now we can write $dE_0$ as the electric field amplitude density, $\rho$, times the length over
which the field is being emitted, $dx$. Additionally, we can write $\Delta r$ in terms of $x$ as $x\sin\theta$.
Finally, we want to add together all the infinitesimal electric fields, so we integrate from
$x = 0$ to $x = a$, the width of the slit.

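Making these substitutions into Equation 12.22, and integrating yields the following.

$$\begin{aligned}
E &= \int_0^a \frac{\rho}{r_0 + x\sin\theta} \sin(kr_0 - \omega t + kx\sin\theta)\, dx \\
&\approx \frac{\rho}{r_0} \int_0^a \sin(kr_0 - \omega t + kx\sin\theta)\, dx \\
&= \frac{2\rho}{k r_0 \sin\theta} \sin\left(\frac{ka\sin\theta}{2}\right) \sin\left(kr_0 - \omega t + \frac{ka\sin\theta}{2}\right)
\end{aligned} \qquad (12.23)$$

Figure 12.3: Diagram of the single slit experiment.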

In the first step, we have just made the substitutions and set up the integral. Performing
this integral is very messy, so we make the approximation that $\rho/(r_0 + x\sin\theta)$ is
approximately $\rho/r_0$ in the second step. This is a valid approximation because $r_0 \gg \Delta r$.
Notice that we cannot make the same approximation in the sine term. This is because
the phase difference of the waves is entirely determined by $\Delta r$. In the third and final
step we perform the definite integral. Now we have the total electric field from the
infinitesimal contributions along the slit!

Just as with the double slit experiment, we are not very interested in the electric field 
amplitude at point P but rather the average intensity. To find this, we square the electric 
field given in Equation 12.23, and time average the trigonometric functions dependent 
on time. 

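$$\begin{aligned}
I \propto E^2 &= \left[\frac{2\rho}{k r_0 \sin\theta} \sin\left(\frac{ka\sin\theta}{2}\right) \sin\left(kr_0 - \omega t + \frac{ka\sin\theta}{2}\right)\right]^2 \\
&= \frac{2\rho^2}{k^2 r_0^2 \sin^2\theta} \sin^2\left(\frac{ka\sin\theta}{2}\right) \\
&= \frac{\rho^2 \lambda^2}{2\pi^2 r_0^2 \sin^2\theta} \sin^2\left(\frac{\pi a \sin\theta}{\lambda}\right) \\
&= \frac{\rho^2 \lambda^2 L^2}{2\pi^2 r_0^2 y^2} \sin^2\left(\frac{\pi a y}{\lambda L}\right)
\end{aligned} \qquad (12.24)$$

For the first step, we have just squared the electric field to find the intensity. In the
second step we have replaced $\sin^2(kr_0 - \omega t + ka\sin\theta/2)$ with a time averaged value of
1/2. In the third step we have replaced the wavenumber $k$ with its definition given in
Equation 12.2. In the final step we have made the small angle approximation of Equation
12.16.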

The final result of Equation 12.24 is proportional to the intensity of the diffraction
pattern at $y$ on the projection screen, and so we can absorb all the constants into a
single constant $I_0$ and equate Equation 12.24 with intensity.

$$I(y) = I_0 \frac{\sin^2\left(\frac{\pi a y}{\lambda L}\right)}{y^2} \qquad (12.25)$$

This intensity pattern is nearly identical to that of the double slit given in Equation
12.19, but cosine squared has been replaced with sine squared, and the whole quantity
is divided by $y^2$. This makes all the difference in the world, as the shape of the intensity
pattern for the single slit, shown in Figure 12.3(b), is significantly different from that of
the double slit.

Looking at Figure 12.3(b), we see that we can write a relation similar to Equation
12.20, but instead of locating the maxima, we locate the minima.

$$y_\text{min} = \frac{m\lambda L}{a}, \quad m = 1, 2, 3, \ldots \qquad (12.26)$$

Here $m$ starts at 1 because $y = 0$ is the central maximum rather than a minimum.
Again, we have derived a theoretical result which allows for a simple experimental
verification of the wave nature of light!
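
The single slit pattern can be explored numerically in the same spirit. The sketch below evaluates the intensity of Equation 12.25 and the minima of Equation 12.26 for an assumed slit; the wavelength, slit width, and screen distance are invented for illustration, and the special case at $y = 0$ uses the limit of $\sin^2(cy)/y^2$, which is $c^2$.

```python
import math


def single_slit_intensity(y, wavelength, a, L, i0=1.0):
    """Equation 12.25: I(y) = I0 sin^2(pi a y / (lambda L)) / y^2."""
    c = math.pi * a / (wavelength * L)
    if y == 0.0:
        return i0 * c**2   # limit of sin^2(c y) / y^2 as y goes to zero
    return i0 * math.sin(c * y) ** 2 / y**2


def minimum_position(m, wavelength, a, L):
    """Equation 12.26: position of the order-m minimum, m = 1, 2, 3, ..."""
    return m * wavelength * L / a


# Hypothetical setup (assumed values, not from the text).
wavelength = 633e-9   # metres
a = 0.1e-3            # slit width, metres
L = 2.0               # distance to the screen, metres

centre = single_slit_intensity(0.0, wavelength, a, L)
for m in range(1, 4):
    y_dark = minimum_position(m, wavelength, a, L)
    y_between = y_dark + 0.5 * wavelength * L / a   # halfway to the next minimum
    print(f"m = {m}: minimum at {1e3 * y_dark:.2f} mm, "
          f"I/I(0) halfway to the next minimum = "
          f"{single_slit_intensity(y_between, wavelength, a, L) / centre:.4f}")
```

The side peaks between the minima are much dimmer than the central peak, which is the effect of dividing by $y^2$.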

12.4 Double Slit Diffraction 

As mentioned earlier, the interference pattern from two discrete slits given in Figure
12.2(b) is difficult to observe with most experimental slits, as oftentimes the slit width is
comparable to the slit separation. This means that what was an interference problem
now becomes a diffraction problem and the exact same method used for the single slit
can be used but with different limits of integration. The derivation does not introduce
any new concepts, and trudging through the math is not useful to the discussion here,
so the results of double slit diffraction will be presented without derivation.⁴
The intensity for double slit diffraction is given by,

$$I(y) = I_0 \cos^2\left(\frac{\pi d y}{\lambda L}\right) \frac{\sin^2\left(\frac{\pi a y}{\lambda L}\right)}{y^2} \qquad (12.27)$$

where $a$ is the width of each of the two slits, and $d$ is the separation between the middle
of the two slits. But wait just one moment, this equation looks very familiar! That's
because it is; the intensity pattern for double slit diffraction is just the intensity pattern 
for double slit interference, given in Equation 12.19 multiplied by the intensity pattern 
for single slit diffraction, given in Equation 12.25. 

This means we know what Equation 12.27 should look like, just Figure 12.2(b) multi- 
plied by Figure 12.3(b). The intensity plot as a function of y for double slit diffraction 
is shown in Figure 12.4, where the complete intensity pattern is given in blue, and the 
overlying single slit intensity pattern is given in red. Notice that when the distance
between the slits is much less than the width of the slits, $d \ll a$, the intensity pattern
becomes similar to that of a single slit. Additionally, when $d$ is much larger than $a$, the
double slit interference separation becomes very small and is difficult to resolve from the 
single slit diffraction. In Figure 12.4 the width of the slits is twice the distance between 
the slits, a = 2d. 
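
The product structure of Equation 12.27 is easy to see numerically. The sketch below multiplies the double slit interference factor by the single slit envelope for a hypothetical pair of slits. All of the numbers are assumptions; in particular the separation is taken to be five times the slit width, which is not the ratio used in Figure 12.4.

```python
import math


def double_slit_diffraction(y, wavelength, a, d, L, i0=1.0):
    """Equation 12.27: double slit interference times the single slit envelope."""
    c = math.pi / (wavelength * L)
    interference = math.cos(c * d * y) ** 2
    if y == 0.0:
        envelope = (c * a) ** 2   # limit of sin^2(c a y) / y^2 as y goes to zero
    else:
        envelope = math.sin(c * a * y) ** 2 / y**2
    return i0 * interference * envelope


# Hypothetical setup (assumed values, not from the text).
wavelength = 633e-9   # metres
a = 0.05e-3           # slit width, metres
d = 0.25e-3           # slit separation, metres (five times the width)
L = 2.0               # distance to the screen, metres

centre = double_slit_diffraction(0.0, wavelength, a, d, L)
for m in range(6):
    y = m * wavelength * L / d   # interference maxima from Equation 12.20
    relative = double_slit_diffraction(y, wavelength, a, d, L) / centre
    print(f"m = {m}: y = {1e3 * y:6.2f} mm, I / I(0) = {relative:.4f}")
```

Each bright fringe is scaled down by the single slit envelope, and with this choice of $d = 5a$ the fifth order fringe lands on the first single slit minimum and essentially vanishes.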

There is one final comment that needs to be made about double slit interference and 
diffraction, and single slit diffraction. Throughout the entirety of this chapter we have 
assumed that light is purely a wave, and that Huygens was correct. It turns out that 
this is not the case, as shown by Einstein's famous explanation of the photoelectric effect. With
this explanation, Einstein demonstrated that discrete quanta of light, photons, carry
an energy $hf$ where $f$ is the frequency of the light and $h$ is Planck's constant. The
argument that physicists had thought resolved since 1802 was yet again revived, and the 
idea that light is both a particle and a wave was developed. This idea is now known as 
the wave-particle duality of light, and applies not only to light, but everything. 

⁴ For readers who do not trust me, do the derivation yourself. Use the exact same method as the single
slit, but now perform two integrals. Define $a$ as the slit width and $d$ as the distance between the
middle of both slits. For the first slit the limits of integration will be from 0 to $a$. For the second
integral the limits will be from $d - a/2$ to $d + a/2$. Then enjoy slogging through all the math!

Figure 12.4: Intensity pattern for double slit diffraction. Here, $a = 2d$.

The investigation of the wave-particle duality of light has led to many breakthroughs
in the field of quantum mechanics. One of the most interesting results is that every
particle, and consequently all matter, is made up of waves. The wavelength of these
waves is given by the de Broglie wavelength, $\lambda = h/p$ where $p$ is the momentum of
the object. We are not able to observe these waves in everyday objects such as cars
because the momentum of these objects is so much larger than that of electrons or
photons that the wavelength is tiny in comparison to that of elementary particles.
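
To see just how tiny, here is a quick sketch comparing the de Broglie wavelength of an electron with that of a car. The speeds and the mass of the car are invented for illustration, and the momentum is taken to be the non-relativistic $p = mv$, which is an assumption that is reasonable at these speeds.

```python
PLANCK_H = 6.62607015e-34         # Planck's constant, J s
ELECTRON_MASS = 9.1093837015e-31  # kg


def de_broglie_wavelength(mass, speed):
    """lambda = h / p with the non-relativistic momentum p = m v."""
    return PLANCK_H / (mass * speed)


# Assumed values, for illustration only.
electron = de_broglie_wavelength(ELECTRON_MASS, 1.0e6)   # electron at 10^6 m/s
car = de_broglie_wavelength(1000.0, 30.0)                # 1000 kg car at 30 m/s

print(f"electron: {electron:.3e} m")
print(f"car:      {car:.3e} m")
```

The electron's wavelength is comparable to the size of an atom, while the car's wavelength is dozens of orders of magnitude smaller than anything that could be resolved with a slit.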

Because the waves of particles are governed by quantum mechanics, the results of some 
double slit examples are rather mind-bending. For example, electrons were shot through 
a double slit experiment, and a time lapse photograph of the results was made. The 
interference patterns of Figure 12.4 were observed, along with the individual impacts of 
the electrons! The same experiment was performed again, but it was ensured that only 
one single electron passed through a slit at a time. The interference pattern remained, 
despite the fact that there were no other electrons to interfere with! Finally, electrons 
were passed through the slits one at a time, and one of the two slits was covered so 
the experimenters knew from which slit the electron emerged. When this was done, the 
interference pattern disappeared! This incredible result provides direct evidence of the 
hypothesis in quantum mechanics that the mere action of observing a particle changes 
the particle, even if the measurement was non-destructive. 

12.5 Experiment 

The experiment associated with this chapter consists of three parts: in the first part the 
width of a single slit is measured by measuring the distance between first and second 
order minima. The intensity pattern should look very similar to the pattern given in 
Figure 12.3(b), and so the slit width a can be determined using Equation 12.26. In 
the second part of the experiment, the width of a human hair is measured. This is 
accomplished by what is known as Babinet's principle. 

Babinet's principle states that the diffraction pattern for any combination of slits (or 
more generally shapes) is the same as the diffraction pattern for the exact opposite 
setup, i.e. each opening is replaced with a blocking, and each blocking is replaced with 
an opening. For the case of a human hair, the exact opposite setup is a single slit with 
the width of a human hair. But how exactly does this work? 

Consider first the electric field present from just shining a laser onto the projection
screen, let us call this $E_u$. Then consider blocking the light with a human hair and call
this electric field $E_h$. Finally, we block the light with the exact opposite of the human
hair, a single slit the width of the human hair, and call this electric field $E_s$. We can
write the equation,

$$E_u = E_h + E_s \qquad (12.28)$$

or the total unobstructed electric field is equal to the electric field not blocked by the hair,
plus the electric field not blocked by the slit. Now we make a rather bold assumption and
say that $E_u \approx 0$. If this is the case, then $E_h = -E_s$ and since intensity is proportional
to the square of the electric field, $I_h = I_s$!

But by now, alarm bells should be going off. How can we possibly justify $E_u \approx 0$? If
this were the case we would not be able to see any interference pattern! The justification
behind this assumption is a bit unsatisfying, but the end result can be experimentally
verified. The idea is that for Fraunhofer or far-field diffraction and interference the
projection screen is so far away from the initial light source that the light has spread
sufficiently to yield an electric field of approximately zero. Because the light is spreading out in spherical
waves, the electric field decreases by a factor of $1/r$, and so for a very large $r$ the electric
field is virtually non-existent. This assumption, and consequently Babinet's principle,
does not apply to near-field or Fresnel diffraction.
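
Since Babinet's principle lets us treat the hair as a single slit, the width follows from the spacing of the minima in Equation 12.26. The sketch below shows the arithmetic for a set of hypothetical measurements; the function name and every number in it are assumptions for illustration, not data from the experiment.

```python
def width_from_minima(y_first, y_second, wavelength, L):
    """Slit (or hair) width from the first and second order minima.

    Equation 12.26 gives y_m = m * wavelength * L / a, so adjacent minima
    are separated by wavelength * L / a.
    """
    spacing = abs(y_second - y_first)
    return wavelength * L / spacing


# Hypothetical measurements (assumed, for illustration only).
wavelength = 633e-9   # laser wavelength, metres
L = 1.5               # distance from the hair to the screen, metres
y_first = 12.0e-3     # position of the first order minimum, metres
y_second = 24.0e-3    # position of the second order minimum, metres

hair_width = width_from_minima(y_first, y_second, wavelength, L)
print(f"estimated width = {1e6 * hair_width:.1f} micrometres")
```

Using the spacing between two minima, rather than the position of a single minimum, means the result does not depend on locating the exact centre of the pattern.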

The final part of the experiment consists of measuring the separation between two
slits in a double slit diffraction (not interference) experiment similar to that of Figure
12.4. This means that $a \sim d$ and so the single slit diffraction and double slit interference
patterns will both be visible. Make sure not to confuse the two patterns; oftentimes the
double slit interference pattern is very small, and must be observed using a magnifying
glass.

While exploring the phenomena of diffraction in Chapter 12, we stated that the "first"
resolution of the light wave-particle duality was arrived at by the experimental confir- 
mation of Young's slit experiment that light was a wave. But what about the "second" 
resolution? This chapter first looks at the classical theory of light as a wave, first ver- 
ified by Young's experiment, and then given a theoretical groundwork by James Clerk 
Maxwell. The theoretical framework for light and electromagnetism, Maxwell's equations,
provides an incredibly accurate theory, yet at the turn of the 20th century problems
began to arise. Specifically, Albert Einstein began to question the purely wave nature 
of light, and hypothesized his famous photoelectric effect theory which stated that light 
was made up of particles called photons. 

With the photoelectric effect, Einstein postulated a famous relationship between the 
energy of a photon and its frequency, which requires the use of what is known as Planck's 
constant. This constant had been experimentally determined by Max Planck a few years 
earlier through his successful attempt to model what is known as black-body radiation. 
In his model, Planck required the quantization of light, yet he did not truly recognize the 
implications of this quantization until Einstein's photoelectric effect. The final section 
of this chapter briefly explores the theory of black-body radiation in an attempt to put 
the history of the constant into context. 



13.1 Maxwell's Equations 

In Chapter 11, while deriving the Fresnel equations we learned that not only is light a
wave, but that it is an electromagnetic wave, or an electric wave, $\vec{E}$, perpendicular to a
magnetic wave, $\vec{B}$, travelling in the direction $\vec{E} \times \vec{B}$. Furthermore, we learned that the
intensity of light is proportional to the amplitude of the electric wave squared, $E_0^2$. From
Young's slit experiment we know that light must be a wave, but how do we know that light
is made up of a magnetic and electric field, and how do we know that the intensity of
light is the amplitude squared of the electric field? To answer these questions we must
turn to what are known as Maxwell's equations.

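During the mid 1800's physicists such as Gauss, Faraday, Ampere, and Maxwell had
become increasingly interested in both electrical and magnetic phenomena, and through
experimental trial and error, determined the two phenomena were governed by the same
force, or the electromagnetic force. Maxwell went even further, and united four laws
that governed this force, Maxwell's equations, which are given in a differential form
by Equation 13.1 and an integral form by Equation 13.2.

$$\nabla \cdot \vec{E} = \frac{\rho}{\epsilon_0} \quad (13.1a) \qquad\qquad \oint_V \vec{E} \cdot d\vec{A} = \frac{Q}{\epsilon_0} \quad (13.2a)$$

$$\nabla \times \vec{E} = -\frac{\partial \vec{B}}{\partial t} \quad (13.1b) \qquad\qquad \oint_A \vec{E} \cdot d\vec{l} = -\frac{\partial \Phi_B}{\partial t} \quad (13.2b)$$

$$\nabla \cdot \vec{B} = 0 \quad (13.1c) \qquad\qquad \oint_V \vec{B} \cdot d\vec{A} = 0 \quad (13.2c)$$

$$\nabla \times \vec{B} = \mu_0 \vec{J} + \mu_0 \epsilon_0 \frac{\partial \vec{E}}{\partial t} \quad (13.1d) \qquad\qquad \oint_A \vec{B} \cdot d\vec{l} = \mu_0 I + \mu_0 \epsilon_0 \frac{\partial \Phi_E}{\partial t} \quad (13.2d)$$

Here $\Phi_B$ and $\Phi_E$ denote the magnetic and electric flux through the area $A$.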

At first glance these equations are very intimidating, but after getting past all the 
symbols, their meaning is quite simple and elegant. To begin, we must first explain 
the difference between the differential and integral form of the equations. Both sets 
of equations, Equations 13.1 and 13.2, represent the exact same physical laws, but are 
different mathematical and physical interpretations of the laws. Oftentimes one form is 
taught in class, and the other form is ignored, but both forms of Maxwell's equations 
provide important physical insights. More importantly, when solving electromagnetism 
problems, choosing the appropriate form of the law can greatly simplify the math behind 
the problem. 

Looking at Equation 13.1a we see that the divergence of the electric field, $\nabla \cdot \vec{E}$, is
equal to the charge density, $\rho$, divided by the electric constant, $\epsilon_0$. The divergence of
a vector field, $\vec{F}$, is given mathematically by $\partial F_x/\partial x + \partial F_y/\partial y + \partial F_z/\partial z$ and is a scalar
quantity (a number not a vector). Visually, the divergence of a field is the net outward flux per
unit volume of all the vectors passing in and out of a small surface drawn around a single point
in the field. For example, the divergence of the electric field from a single electron would
be the net flux per unit volume of all the electric field vectors passing through a small sphere
drawn around the electron. In this example, because electrons have a negative charge,
the divergence of $\vec{E}$ would be negative. This makes sense, as $\rho$ should be negative as
well.

The definition of divergence relates directly to the integral form of Equation 13.1a 
given by Equation 13.2a. This equation states that if a volume of any shape, V, is drawn 
around some amount of enclosed charge Q, the surface integral of the electric field is 
equal to the enclosed charge, divided by the electric constant. The surface integral is 
given by integrating over the dot product of the electric field with the infinitesimally
small piece of area, $d\vec{A}$, through which the electric field is passing, over the entire surface of the volume.
Notice that $d\vec{A}$ is a vector quantity, and the direction of $d\vec{A}$ is perpendicular or normal
to the surface of the volume at that point. Oftentimes a surface can be chosen so that
$\vec{E}$ is perpendicular to the surface, and so parallel to $d\vec{A}$, and consequently $\vec{E} \cdot d\vec{A}$ is just $|\vec{E}||d\vec{A}|$.
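
The claim that the surface integral depends only on the enclosed charge, and not on the details of the surface, can be tested numerically. The sketch below integrates $\vec{E} \cdot d\vec{A}$ for a single electron over a sphere using a simple midpoint rule, once with the electron at the centre and once displaced from it. The Coulomb field of a point charge is assumed, and the function name, grid sizes, and offsets are my own choices for illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12        # electric constant, F/m
ELECTRON_CHARGE = -1.602176634e-19  # C


def flux_through_sphere(q, offset, radius, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the surface integral of E . dA.

    The sphere is centred on the origin; the point charge q sits on the
    z axis at z = offset, with |offset| < radius so that it is enclosed.
    """
    coulomb = q / (4.0 * math.pi * EPSILON_0)
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    flux = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of this patch of the sphere.
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            # Vector from the charge to the patch.
            rx, ry, rz = radius * nx, radius * ny, radius * nz - offset
            r = math.sqrt(rx * rx + ry * ry + rz * rz)
            e_dot_n = coulomb * (rx * nx + ry * ny + rz * nz) / r**3
            flux += e_dot_n * radius**2 * math.sin(theta) * d_theta * d_phi
    return flux


expected = ELECTRON_CHARGE / EPSILON_0
for offset in (0.0, 0.3):
    flux = flux_through_sphere(ELECTRON_CHARGE, offset, radius=1.0)
    print(f"offset = {offset}: flux / (Q / epsilon_0) = {flux / expected:.5f}")
```

Both ratios should come out very close to one, and the flux itself is negative, as expected for the negative charge of the electron.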

Moving on to Equation 13.1b, we see that the curl of the electric field, $\nabla \times \vec{E}$, is equal
to the negative of the partial derivative of the magnetic field with respect to time. For
readers not familiar with the curl of a field, look at Chapter 17. Briefly, the curl of a
vector field is how much the vector field is curling (like a whirlpool) about a certain
point. To understand what this means physically, we turn to the integral form given
by Equation 13.2b, which states that the line integral of the electric field is equal to
the opposite of the change in magnetic flux over time through the area, $A$, enclosed by the line
integral. As an example, consider an electromagnet that is becoming more powerful over
time. If we draw a circle around the electromagnet, the line integral of the electric field
around the circumference of the circle is equal to the negative of the rate of change of the magnetic flux
going through the area of the circle. In this case the magnetic field is growing, and so
an electric field is induced that circulates around the electromagnet, along the circle.

If we now look at Equations 13.1c and 13.1d we see that their left hand sides look
very similar to Equations 13.1a and 13.1b but with B swapped in for E. The same 
observation applies to the integral forms as well. But, the right hand sides of these 
equations do not match. This is because electric fields are caused by electric charges, 
such as the electron or positron, which can be either negative or positive. Magnetic 
fields, however, are caused by magnets which always consist of a north and south pole. 
The idea of a magnet that is only a north pole, or only a south pole is called a magnetic 
monopole and has not yet been experimentally observed. Equations 13.1c and 13.2c 
essentially state this fact: whenever a magnetic field is enclosed, the total "magnetic 
charge" has to be zero because there are no magnetic monopoles. 

The same idea applies to Equation 13.1d. Here, electric charge can flow (think electrons) 
and so the term $\mu_0 \vec{J}$, the magnetic constant times the current density, must 
be included. The analogous quantity for magnetism does not exist, and so this term is 
missing from Equation 13.1b. In the integral form, Equation 13.2d, current density is 
just replaced with the total current flowing through the area enclosed by the line integral. 

13.2 Electromagnetic Waves 

Entire books have been written just to explain the application of Maxwell's equations, 
so it is not expected that the brief outline above is anywhere close to a full explanation. 
However, hopefully it provides a general idea of how the equations work, and their 
interrelations. For readers interested in learning more about Maxwell's equations and how to 
use them to solve electromagnetism problems, the book Introduction to Electrodynamics 
by David J. Griffiths provides an excellent introduction. 

But back to our original questions, how do we know light is an electromagnetic wave, 
and why is the intensity of light proportional to amplitude squared? If we combine 
Equation 13.1a with Equation 13.1b we can arrive at, 

$$\nabla^2 \vec{E} = \mu_0 \epsilon_0 \frac{\partial^2 \vec{E}}{\partial t^2} \tag{13.3}$$

where a few steps have been skipped in the process. Hopefully this equation is vaguely 
familiar. Looking back all the way to Chapter 10, we see that Equation 13.3 satisfies a 
three dimensional form of the wave equation, given by Equation 10.7! Now here's the 
exciting part. We know that the coefficient on the right hand side of Equation 13.3 is 
equal to one over the velocity of the wave squared, by using Equation 10.9. This gives 
us, 

$$\mu_0 \epsilon_0 = \frac{1}{v^2} \;\Rightarrow\; v = \frac{1}{\sqrt{\mu_0 \epsilon_0}} = 3.0 \times 10^8\ \mathrm{m/s} \tag{13.4}$$

which is the speed of light. We can also repeat the same process using the equations 
governing the magnetic field, Equations 13.1c and 13.1d, and again arrive at a velocity 
equal to the speed of light. Because of these relations, we know that light waves are 
made up of electric fields and magnetic fields and can predict the speed of light because 
Maxwell's equations satisfy the wave equation! 
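
For readers who like to check numbers themselves, the arithmetic behind Equation 13.4 can be sketched in a few lines of Python. The numerical values of the two constants and the function name `wave_speed` are not taken from this chapter; they are assumptions added here purely for illustration.

```python
import math

# Vacuum constants in SI units. These numerical values are assumptions
# supplied for this sketch; the chapter itself does not quote them.
MU_0 = 4.0e-7 * math.pi        # magnetic constant, N / A^2
EPSILON_0 = 8.8541878128e-12   # electric constant, F / m


def wave_speed(mu: float, epsilon: float) -> float:
    """Speed v implied by the wave-equation coefficient mu * epsilon = 1 / v^2."""
    return 1.0 / math.sqrt(mu * epsilon)


c = wave_speed(MU_0, EPSILON_0)
print(f"v = {c:.3e} m/s")
```

The only physics used is the statement that the coefficient in Equation 13.3 equals one over the wave speed squared.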

So that answers the first question, but how do we know that the intensity of light 
(or the energy of light) is proportional to the amplitude of the electric field squared? 
We now know that a light wave consists of an electric and a magnetic field, and it turns 
out that the energy density flux of an electromagnetic field is given by the Poynting 
vector, 

$$\vec{S} = \frac{\vec{E} \times \vec{B}}{\mu_0} \;\Rightarrow\; |\vec{S}| = c \epsilon_0 |\vec{E}|^2 \tag{13.5}$$

where we are able to perform the second step because we know E is perpendicular to 
B and E = cB where c is the speed of light. The Poynting vector tells us the energy, 
or intensity of light, is proportional to the amplitude of the electric field squared. Of 
course where the Poynting vector comes from has not been explained, but that requires 
quite a bit more theory, and is left to the more intrepid readers to find out on their own. 

13.3 The Photoelectric Effect 

By the turn of the 20th century, problems with the interpretation of Maxwell's equations 
had begun to arise, and the question of whether light was merely a wave arose yet 
again. One of the major problems observed was with an experiment known as the 
photoelectric effect. In the photoelectric effect, a strong source of monochromatic 
light is shone onto a metallic surface. The light strikes electrons within the metallic 
surface, and some electrons are ejected. These ejected electrons can be measured, as 
they create a voltage which can be read by a voltmeter. 

The electrons ejected from the surface have the kinetic energy, 

$$KE_{\max} = E_\gamma - \phi \tag{13.6}$$

where $E_\gamma$ is the energy from the incident light striking the electron, and $\phi$ is the work 
function, or energy required to eject the electron from the surface. The work function 
of the surface is dependent upon the material the surface is made from. While reading 
through Chapter 14 it is possible to estimate the order of magnitude for a typical work 
function by determining how much energy is required to remove the most tightly bound 
electron from a hydrogen atom. 

Work function aside, the kinetic energy of the ejected electrons can be measured by 
passing the electrons through an electric field. Within an electric field, the electrons 
feel the Coulomb force (see Equation 14.2) from the electric field. By 
Newton's second law, we know that if a force is applied against the motion of the electron, 
the electron must decelerate, and if the electric field is large enough, the electron will 
come to a stop. The potential difference that brings all ejected electrons to a stop is 
called the stopping potential. 

By using this information, it is possible to determine the maximum kinetic energy of 
the electrons being ejected from the surface. When the electric field is increased until 
no electrons are able to pass through the field then, 

$$KE_{\max} = V_s e \tag{13.7}$$

or the maximum kinetic energy of the ejected electrons, $KE_{\max}$, is equal to the voltage 
of the stopping potential, $V_s$, times the charge of an electron, e. 

Now, by Maxwell's equations, we know that if we increase the amplitude of the incident 
light, we increase the intensity of the light, and we also increase the energy of the light. 
This means that classically, the kinetic energy of the escaping electrons should increase 
if we increase the intensity of the light. If we change the frequency or color of the 
light, nothing should happen; the escaping electrons should have the same maximum 
kinetic energy. Unfortunately, this effect of an increase in electron kinetic energy with an 
increase in incident light intensity was not observed when the experiment was performed. 
Rather, three disturbing phenomena were observed. 

1. An increase in light intensity increased the number of ejected electrons, but did 
not increase the maximum kinetic energy of the ejected electrons. 

2. An increase in the frequency of the light increased the maximum kinetic energy of 
the ejected electrons, but did not increase the number of ejected electrons. 

3. If the frequency of the light was too low, no electrons were ejected, no matter how 
bright a light was used. 

This baffled physicists, as these results did not match with Maxwell's equations. Albert 
Einstein, however, looked at the results, and decided that perhaps light was not just a 
wave, despite the rather overwhelming evidence of Young's experiment and the incredible 
success of Maxwell's equations. Instead, he postulated that light is made up of particles 
or quanta called photons. More importantly, he theorized that the energy of a photon 
is proportional to the frequency of the photon, or, 

$$E_\gamma = hf \tag{13.8}$$

where $E_\gamma$ is the energy of the photon, h is Planck's constant, and f is the frequency of 
the photon. 

Using Equation 13.8, we can now explain the odd results of the photoelectric experiment. 
Rather than thinking of an incident wave of light striking an electron, we can 
think of a single photon striking an electron. If we increase the number of incident 
photons, we increase the intensity of the light, and consequently the number of ejected 
electrons increases. The kinetic energy of the ejected electrons does not change however, 
because the energy of the incident light striking an electron is just the energy of a single 
photon and does not change without changing the frequency of the light. We have just 
explained the first observation! 

But what about the second observation? The same logic applies. Now we do not 
increase the intensity of the light, so the number of photons stays the same, but we do 
increase the frequency and so by Equation 13.8, the energy of each photon is increased. 
Going back to Equation 13.6, $E_\gamma$ increases, and so the maximum kinetic energy of the 
electrons must also increase. If however, the frequency of the incoming photons is not 
high enough, the energy of the photon will be less than the binding energy, $E_\gamma < \phi$, and 
so the electron will have negative kinetic energy! This of course does not make sense, as 
kinetic energy cannot be negative; it must be zero or positive. 

What this does mean though, is that the electron absorbs less energy than is needed to 
eject it from the surface. The electron will be less strongly bound to the surface, but it 
cannot escape, so no ejected electrons will be observed if $E_\gamma < \phi$, and so we now have an 
explanation for the third observation. Of course, it is interesting to know what happens 
when the electron is not ejected from the surface, but its binding energy is decreased. 
This is covered more in the chapter on fluorescence, Chapter 15. 
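
The photon picture can be turned into a small calculation. The sketch below combines Equations 13.6 and 13.8 and reports the maximum kinetic energy for a few frequencies, returning nothing at all when the photon energy falls short of the work function. The work function of 2.3 eV, the chosen frequencies, and all function names are hypothetical values picked for illustration, not properties of any apparatus described in this chapter.

```python
PLANCK_H = 6.62607015e-34          # Planck's constant, J s
ELECTRON_CHARGE = 1.602176634e-19  # fundamental charge, C (also joules per eV)


def photon_energy_ev(frequency_hz: float) -> float:
    """Equation 13.8, E = h f, converted from joules to electron volts."""
    return PLANCK_H * frequency_hz / ELECTRON_CHARGE


def max_kinetic_energy_ev(frequency_hz: float, work_function_ev: float):
    """Equation 13.6. Returns None when no electron can be ejected."""
    ke = photon_energy_ev(frequency_hz) - work_function_ev
    return ke if ke > 0.0 else None


def threshold_frequency_hz(work_function_ev: float) -> float:
    """Lowest frequency that still ejects electrons, found from h f = phi."""
    return work_function_ev * ELECTRON_CHARGE / PLANCK_H


WORK_FUNCTION_EV = 2.3  # hypothetical surface, chosen only for illustration

for f in (4.0e14, 6.0e14, 8.0e14):
    print(f"f = {f:.1e} Hz -> KE_max = {max_kinetic_energy_ev(f, WORK_FUNCTION_EV)}")
print(f"threshold = {threshold_frequency_hz(WORK_FUNCTION_EV):.3e} Hz")
```

Notice that the intensity of the light never appears in the calculation: a brighter lamp means more photons, not more energetic ones.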

In Chapter 12 it was mentioned that Einstein did not suggest that light was just a 
particle, but rather he suggested that it was both a particle and a wave. The implications 
of this wave-particle duality were already discussed in Chapter 12, but it is important 
to remember how Einstein developed this new idea. The experimental evidence for light 
as a wave was overwhelming, with Maxwell's equations providing a strong theoretical 
groundwork, yet the photoelectric effect demonstrated light must be made of particles. 
Einstein saw what no other physicist at the time could see; if light exhibits the behavior 
of a wave, and the behavior of a particle, it must be both a particle and a wave! 



13.4 Blackbody Radiation 

As stated in the introduction to this chapter, Einstein demonstrated the quantum nature 
of light, yet Planck inadvertently required the quantization of light through the 
black-body radiation experiment and measured his constant, h, before realizing the 
full implications of the constant. So what is a black-body? A black-body is an object 
that absorbs all electromagnetic radiation, both visible and invisible. Hence, it is given 
the name black-body because it cannot be seen from reflected light. However, after 
absorbing electromagnetic radiation, the black-body re-emits light at a variety of frequencies, 
dependent upon the temperature of the black-body. Perfect black-bodies do 
not exist in nature, but both the sun, and the human body act like black-bodies for 
certain frequencies of light! 

The theory behind black-body radiation is based on statistical mechanics, and a variety 
of approaches can be taken. The theory behind these approaches will not be explained 
here, as the math can be very involved, but the resulting equations will be given. The 
first equation describing black-body radiation was proposed by Wien, and later 
derived by Planck and is, 

$$I(f,T) = \frac{2hf^3}{c^2} e^{-\frac{hf}{kT}} \tag{13.9}$$

where $I(f,T)$ is the intensity of the light at a frequency f emitted from a black-body 
with a temperature T. The constant k is Boltzmann's constant, c is the speed of 
light, and h is Planck's constant. The units of I are Joules per square meter. 

It was found that this equation matches black-body radiation well for high frequencies, 
f, but deviates significantly from experiment for low frequencies. Planck went back to 
the drawing board and postulated Planck's law, 

$$I(f,T) = \frac{2hf^3}{c^2} \frac{1}{e^{\frac{hf}{kT}} - 1} \tag{13.10}$$

which is very similar to Equation 13.9, but matches experiment for both high and low 
frequencies. In his derivation of Equation 13.10, Planck required that light obeyed Equation 13.8 
but without realizing that he had quantized light! 

Finally, a third model based entirely on classical mechanics by Rayleigh and Jeans 
stated, 

$$I(f,T) = \frac{2f^2kT}{c^2} \tag{13.11}$$

but experimentally, this formula only matches the intensity of lower frequencies of light. 
The formulation of this theory allowed physicists to understand the importance of using 
quantum mechanics to explain black-body radiation. Without the Rayleigh-Jeans 
equation, Planck would not have realized the implications of his quantization of light. 

A comparison between Equations 13.9, 13.10, and 13.11 is given in Figure 13.1. 
Equation 13.10 correctly describes the spectrum of light from black-body radiation, while 
Equation 13.9 describes the radiation well for high frequencies, and Equation 13.11 for low 
frequencies. The black-body radiation spectrum given in Figure 13.1 is for an object with 
a temperature of 310 K, the average temperature of the human body. The light emitted 
from the human body, due to black-body radiation, is in the infrared. This is how 
many types of night vision systems work; they detect the black-body radiation emitted 
from the human body in the infrared spectrum. 
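
The comparison between the three models can also be made numerically. The following Python sketch evaluates Equations 13.9, 13.10, and 13.11 at 310 K for a low, a middle, and a high frequency. The three sample frequencies, the constant values, and the function names are assumptions chosen for illustration and do not come from the text.

```python
import math

PLANCK_H = 6.62607015e-34    # Planck's constant, J s
BOLTZMANN_K = 1.380649e-23   # Boltzmann's constant, J / K
LIGHT_C = 2.99792458e8       # speed of light, m / s


def wien(f: float, temp: float) -> float:
    """Equation 13.9: agrees with experiment only at high frequency."""
    x = PLANCK_H * f / (BOLTZMANN_K * temp)
    return 2.0 * PLANCK_H * f**3 / LIGHT_C**2 * math.exp(-x)


def planck(f: float, temp: float) -> float:
    """Equation 13.10: agrees with experiment at all frequencies."""
    x = PLANCK_H * f / (BOLTZMANN_K * temp)
    # math.expm1(x) is exp(x) - 1, evaluated accurately even for small x.
    return 2.0 * PLANCK_H * f**3 / LIGHT_C**2 / math.expm1(x)


def rayleigh_jeans(f: float, temp: float) -> float:
    """Equation 13.11: agrees with experiment only at low frequency."""
    return 2.0 * f**2 * BOLTZMANN_K * temp / LIGHT_C**2


BODY_TEMP = 310.0  # kelvin, the temperature used for Figure 13.1

for f in (1.0e11, 1.0e13, 1.0e14):  # illustrative sample frequencies, Hz
    print(f"{f:.0e} Hz  Wien {wien(f, BODY_TEMP):.3e}  "
          f"Planck {planck(f, BODY_TEMP):.3e}  "
          f"Rayleigh-Jeans {rayleigh_jeans(f, BODY_TEMP):.3e}")
```

At the lowest frequency the Rayleigh-Jeans value sits within about a percent of Planck's law while Wien's form is far too small, and at the highest frequency the roles are reversed.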

13.5 Experiment 

The experiment for this chapter demonstrates the photoelectric effect, using the method 
discussed in the third section of this chapter. Luckily, all of the complicated apparatus 
required during the early 1900s to create a stopping potential and measure the maximum 
kinetic energy of the ejected electrons can be condensed into a small black box with the 
magic of circuits and a few other clever ideas. The entire apparatus for the experiment 
consists of this black box, a bright ultraviolet lamp, and various colored filters that can 
be used to change the frequency of the light incident on the box. 

Figure 13.1: An example of the black-body radiation given off by the human body at 310 
K. The green curve gives the incorrect classical theory of Equation 13.11, 
the red curve the incorrect theory of 13.9, and the blue curve the correct 
theory derived by Planck of Equation 13.10. 

Within the black box is a photodiode, essentially a surface that ejects electrons when 
struck by photons. This surface has some work function, $\phi$, which is unknown, and must 
be determined from the data taken during the experiment. The ejected electrons are then 
driven into a capacitor which creates an electric field. Over time the electric field within 
the capacitor increases until it reaches the stopping potential, and then stabilizes at this 
value. A voltmeter is connected to this capacitor and measures the voltage drop across 
the capacitor, giving the stopping potential. 

Using the voltmeter and various filters, a variety of data points can be taken of light 
frequency and measured stopping potential. By combining Equations 13.6, 13.7, and 
13.8 we can write, 

$$V_s = \frac{h}{e} f - \frac{\phi}{e} \tag{13.12}$$

or the stopping potential is equal to Planck's constant times frequency less the work 
function, all over the fundamental electric charge. If the theory above is correct, then we 
can plot the data taken with $V_s$ on the y-axis and f on the x-axis to obtain a relationship 
governed by Equation 13.12. The slope of the plot will yield $h/e$ while the intercept of the 
graph will give $-\phi/e$, and so if e is known, it is possible to calculate both Planck's constant, 
and the work function of the apparatus from the data! 
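
One possible way to carry out this analysis is sketched below in Python, using an ordinary least-squares straight line through the data. The frequencies and stopping potentials listed are invented numbers, not measurements, and the helper `fit_line` is a hypothetical name introduced only for this example; real data from the lab should be substituted in their place.

```python
ELECTRON_CHARGE = 1.602176634e-19  # fundamental charge e, C


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical readings, NOT real data: light frequency (Hz) for each filter
# and the stopping potential (V) read from the voltmeter.
frequencies = [5.19e14, 5.49e14, 6.88e14, 7.41e14, 8.20e14]
stopping_potentials = [0.65, 0.77, 1.35, 1.56, 1.89]

slope, intercept = fit_line(frequencies, stopping_potentials)
planck_h = slope * ELECTRON_CHARGE   # Equation 13.12: slope = h / e
work_function_ev = -intercept        # intercept = -phi / e, so phi in eV

print(f"h   = {planck_h:.3e} J s")
print(f"phi = {work_function_ev:.2f} eV")
```

Because the intercept is already in volts, multiplying by e and then converting to electron volts cancels out, which is why the work function in eV is simply minus the intercept.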
14 Hydrogen 



Over a century ago Lord Kelvin, while addressing a room full of the world's leading 
physicists, stated, "There is nothing new to be discovered in physics now. All that 
remains is more and more precise measurement." 1 As it so happens, Kelvin was wrong, 
as Rutherford and Bohr would so spectacularly demonstrate in the following years in 
the field of spectroscopy. 2 

At the beginning of the twentieth century most physicists would describe the structure 
of an atom with what is known as the plum-pudding model. In this model the 
atom is represented by a fluidic "pudding" of positive charge with "plums" of electrons 
floating around inside. A scientist of the time, Ernest Rutherford, questioned this explanation 
and decided to investigate the inner structure of the atom with his revolutionary 
scattering experiment. In this experiment alpha particles were fired at a thin gold 
foil. If the plum-pudding model were accurate, the alpha particles would all pass through 
the foil with only slight deflections. Rutherford found that nearly all the alpha particles 
did pass through the foil undeflected, but a small fraction were scattered through large 
angles, some almost straight back, indicating that the gold atoms consisted mainly of 
. . . nothing, apart from something small and dense enough to turn the occasional alpha 
particle around. 

To explain this astonishing result, Rutherford introduced a new model for the structure 
of the atom called the Rutherford model. In this model the atom consists of a very 
hard and dense core (the nucleus) about which electrons orbit. This model, while 
describing the phenomena which Rutherford observed in his scattering experiment, still 
suffered from a variety of flaws. In an attempt to correct these flaws a scientist by the 
name of Niels Bohr introduced the Bohr model in 1913. This model enjoyed great 
success in experimental observation and is taught to this day as an introduction to 
the atom. The model suffers from a variety of drawbacks and has been superseded by 
the atomic orbital model of the atom, but is still useful in conceptualizing spectral 
emissions of simple atoms. 

14.1 Bohr Model 

So what exactly is the Bohr model? The Bohr model is a planetary model that describes 
the movement of electrons about the nucleus of an atom as is shown in Figure 14.1. Here, 
electrons (analogous to planets) orbit about the charged, dense nucleus of the atom (the 
star of the planetary system). More importantly, the Bohr model is a combination of 
classical and quantum theory with surprisingly accurate results for the hydrogen atom. 



1 Weisstein, Eric. "Kelvin, Lord William Thomson (1824-1907)". 
2 Lord Kelvin was wrong about quite a few things. A quick Google search will reveal pages and pages 
of rather humorous quotes from him. Despite his incredible failures in prediction, he pioneered many 
techniques in thermodynamics. 




Figure 14.1: The Bohr model for a hydrogen atom with electrons at principal quantum 
numbers of n = 2 and n = 3. The symbol $e^-$ denotes an electron and $p^+$ a 
proton. 



Don't be disconcerted by the word quantum here; it will be explained shortly. But first, 
we will begin with the classical portion of the theory and apply it to the hydrogen atom 
which consists of a single negatively charged electron, and a single positively charged 
proton (heavier isotopes of hydrogen also contain neutrons, but we don't care about those here). 

Any object in regular circular motion experiences a force stopping it from flying away, 
whether that force is gravity, tension, or electromagnetism. This centripetal force can 
be written in terms of mass of the orbiting object m, the radius of the orbit r, and the 
tangential velocity of the object v. 3 



$$F = \frac{m v^2}{r} \tag{14.1}$$



In the case of a negatively charged electron orbiting a positively charged nucleus, the 
centripetal force is due to electromagnetism and is described by the Coulomb force. 

$$F = \frac{q_1 q_2}{4\pi\epsilon_0 r^2} \tag{14.2}$$



Here $\epsilon_0$ is the electric constant 4, $q_1$ the charge of the electron, and $q_2$ the charge of 
the proton, while r is the distance between the two. Setting Equation 14.1 equal to 
Equation 14.2 we can find a value for the velocity of the electron in terms of r, $\epsilon_0$, the 
charge of the electron e, and the mass of the electron $m_e$. 

$$\frac{m_e v^2}{r} = \frac{e^2}{4\pi\epsilon_0 r^2} \;\Rightarrow\; v = \sqrt{\frac{e^2}{4\pi\epsilon_0 m_e r}} \tag{14.3}$$

3 There are a variety of methods to derive this formula, one being geometric. Try writing the period for 
one orbit of the object in terms of velocity and distance traveled, but also in terms of velocity and 
acceleration and equate the two. 
4 This constant is known by many names, permittivity of a vacuum, as used in the lab manual, 
permittivity of free space, etc. The bottom line is that this constant allows us to convert from units of 
charge squared over distance squared into force. 

This is where we now need to use a very basic form of quantum mechanics. Previously 
in the Rutherford model, Rutherford allowed electrons to take on any possible velocity. 
The only problem with this is that when electrons accelerate they lose energy by releasing 
energy in the form of a photon. 5 An electron orbiting a nucleus in circular motion 
is constantly experiencing an acceleration towards the nucleus. Hence, the electrons 
orbiting a nucleus must be radiating energy, and so all atoms must be continuously 
emitting light! 6 As everything around us does not usually glow, this is clearly not the 
case. To provide a solution to this dilemma, Bohr suggested that the electrons could 
only take on discrete energies, and that when at these energy levels the electrons would 
not lose their energy to radiation unless forced into a different energy level. 

The idea that the energy of the electrons is quantized, or only allowed at specific 
levels, is a very rudimentary form of quantum mechanics. The reason Bohr suggested 
this rule was not because he had incredible foresight into the intricacies of fully developed 
quantum mechanics, but because such a rule would provide a theory that would match 
experiment. Specifically, atoms in gasses had been observed emitting light at discrete 
energies, rather than a continuum of energies. 

Despite the reasoning behind Bohr's rule of quantizing the energy, the result is that 
the electrons can only take on quantized values of angular momentum. 

$$L = r p \sin\theta = r m_e v = \frac{nh}{2\pi} \tag{14.4}$$

Here the letter n, called the principal quantum number, is a positive integer, 
i.e. 1, 2, 3, etc., and h is the Planck constant. Because angular momentum is 
quantized we see that both the velocity and the radius at which the electron orbits the 
nucleus must also be quantized. We can plug in the value that we obtained for the 
velocity of the electron (using classical mechanics) into Equation 14.4 and solve for the 
radius in terms of the principal quantum number n, h, the mass of the electron $m_e$, the 
electric constant $\epsilon_0$, and the fundamental unit of charge e. 



$$r m_e \sqrt{\frac{e^2}{4\pi\epsilon_0 m_e r}} = \frac{nh}{2\pi} \;\Rightarrow\; r_n = \frac{\epsilon_0 n^2 h^2}{\pi m_e e^2} \tag{14.5}$$



In the second step we have replaced r with $r_n$ to indicate that the value of r is entirely 
dependent upon the principal quantum number n. 



5 For those of you who are interested, the idea of light being quantized in the form of a photon was 
first proposed by Einstein in his explanation of the photoelectric effect. 
6 For more details on electrons radiating energy under acceleration look up synchrotron radiation 
and the Larmor formula. 

Looking at Equation 14.5 we see an incredible result. The radius at which an electron 
orbits a nucleus is only dependent on n! What is even more exciting is that the radius at 
which the electron orbits the nucleus gives us the effective size of the atom. We can see 
what the smallest size of the atom is by letting n = 1. We call this the Bohr radius, 
or $r_1$. 

$$r_1 = \frac{\epsilon_0 h^2}{\pi m_e e^2} \approx 0.5 \times 10^{-10}\ \mathrm{m} \tag{14.6}$$

From first principles in classical mechanics and a little help from quantum mechanics, 
we have derived the size of a ground state hydrogen atom! 
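
The number in Equation 14.6 is easy to reproduce. The Python sketch below evaluates Equation 14.5 for the first few orbits and, as a consistency check, feeds the radius back into Equation 14.3 to get the orbital speed. The constant values and the function names `orbit_radius` and `orbit_speed` are assumptions introduced for this illustration.

```python
import math

PLANCK_H = 6.62607015e-34          # Planck's constant, J s
EPSILON_0 = 8.8541878128e-12       # electric constant, F / m
ELECTRON_MASS = 9.1093837015e-31   # electron mass, kg
ELECTRON_CHARGE = 1.602176634e-19  # fundamental charge, C


def orbit_radius(n: int) -> float:
    """Equation 14.5: radius of the n-th Bohr orbit, in metres."""
    return (EPSILON_0 * n**2 * PLANCK_H**2
            / (math.pi * ELECTRON_MASS * ELECTRON_CHARGE**2))


def orbit_speed(n: int) -> float:
    """Equation 14.3 evaluated at r = r_n, in metres per second."""
    r = orbit_radius(n)
    return math.sqrt(ELECTRON_CHARGE**2
                     / (4.0 * math.pi * EPSILON_0 * ELECTRON_MASS * r))


for n in (1, 2, 3):
    print(f"n = {n}: r = {orbit_radius(n):.3e} m, v = {orbit_speed(n):.3e} m/s")
```

Multiplying the radius, the speed, and the electron mass together for any n should return n times h over 2 pi, which is just Equation 14.4 again.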

14.2 Spectroscopy 

The idea that we can calculate the fundamental size of the hydrogen atom from Equation 
14.5 is exciting, but is rather difficult to experimentally verify; we can't just grab a ruler 
and go measure the distance of an electron from the nucleus. Thankfully, there is another 
way that we can experimentally verify our theory and that is through spectroscopy. 7 
Many people know Einstein for his work on general and special relativity, but less 
commonly known is that Einstein never won the Nobel prize for this work. Instead, he 
won the Nobel prize for what is known as the photoelectric effect. What Einstein 
demonstrated is that light is made up of tiny massless particles, called photons, and that 
the energy of each photon is directly proportional to its frequency 8 

$$E_\gamma = hf \tag{14.7}$$

Here we have used the subscript $\gamma$ (the Greek letter gamma) to denote the photon. What 
this means is that light with a high frequency is more energetic than light with a low 
frequency. For example, a photon of blue light carries roughly one and a half 
times as much energy as a photon of red light. 

But how does this help us with experimentally verifying Bohr's model? As the elec- 
trons orbit the nucleus, they have a potential energy dependent upon their orbital radius 
and a kinetic energy dependent upon their velocity. The total energy of each electron 
is just its kinetic energy plus its potential energy. The kinetic energy for the electron is 
$\frac{1}{2} m_e v^2$, while the potential energy is the electric potential energy between the proton of 
the nucleus and the electron. The electric potential energy for two charges (exactly like 
this case) is given by, 

$$V = \frac{q_1 q_2}{4\pi\epsilon_0 r} \tag{14.8}$$

where $q_1$ and $q_2$ are the charges, $\epsilon_0$ the electric constant, and r the separation between 
the two charges. Notice that this relation is nearly identical to the Coulomb force given 
in Equation 14.2. This is because the integral of force over distance is just energy and 
so Equation 14.8 can be found by integrating Equation 14.2 with respect to r. 

7 Spectroscopy is literally the observation of a spectrum, in this case, a spectrum of light. 
8 This equation is more commonly written as $E = h\nu$ but for notational consistency we have stayed 
with designating frequency with the letter f instead of $\nu$. 

Putting this all together, we can add our standard relation for kinetic energy to our 
potential energy V to find the total energy of the electron as it orbits the nucleus. 



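$$\begin{aligned}
E_n &= \frac{1}{2} m_e v^2 + V \\
&= \frac{e^2}{8\pi\epsilon_0 r_n} - \frac{e^2}{4\pi\epsilon_0 r_n} \\
&= \frac{m_e e^4}{8\epsilon_0^2 h^2 n^2} - \frac{m_e e^4}{4\epsilon_0^2 h^2 n^2} \\
&= -\frac{m_e e^4}{8\epsilon_0^2 h^2 n^2} \approx -\frac{13.6\ \mathrm{eV}}{n^2}
\end{aligned} \tag{14.9}$$

In the first step, we have just written the formula for the total energy, kinetic energy 
plus potential energy. In the second step we have replaced v with Equation 14.3 and V 
with Equation 14.8. In the third step we have replaced $r_n$ with Equation 14.5. In the 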
final steps we have just simplified and then plugged in the values for all the constants. 
Oftentimes when dealing with energies on the atomic (and sub-atomic) scale we will 
express energy in terms of electron volts (eV) instead of joules. An electron volt is the 
energy an electron gains when it passes through an electric potential of one volt, hence 
the name electron volt. Equation 14.9 is what we call the binding energy. 

The first important result to notice about Equation 14.9 is that the total energy of 
the electron is negative! This means that once an electron becomes bound to a hydrogen 
nucleus, it can't escape without external energy. The second result is that for very large 
values of n the binding energy is very close to zero. This means that electrons very far 
away from the nucleus are essentially free electrons. They do not need much help to 
escape from the nucleus. 
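
Both results can be seen by tabulating the binding energy for a handful of levels. The sketch below evaluates the Bohr-model energy, minus $m_e e^4$ over $8 \epsilon_0^2 h^2 n^2$, and converts it to electron volts. The constant values and the function name `binding_energy_ev` are assumptions made for this example.

```python
PLANCK_H = 6.62607015e-34          # Planck's constant, J s
EPSILON_0 = 8.8541878128e-12       # electric constant, F / m
ELECTRON_MASS = 9.1093837015e-31   # electron mass, kg
ELECTRON_CHARGE = 1.602176634e-19  # fundamental charge, C (also joules per eV)


def binding_energy_ev(n: int) -> float:
    """Total energy of the electron in the n-th Bohr orbit, in eV."""
    joules = -(ELECTRON_MASS * ELECTRON_CHARGE**4
               / (8.0 * EPSILON_0**2 * PLANCK_H**2 * n**2))
    return joules / ELECTRON_CHARGE


for n in (1, 2, 3, 20):
    print(f"n = {n:2d}: E = {binding_energy_ev(n):8.4f} eV")
```

Every value is negative, and by n = 20 the energy is already only a few hundredths of an electron volt below zero.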

The question still remains, how can we use the results above to verify our theory, and 
how does spectroscopy come into play? The missing part of the puzzle is what electrons 
do when they transition from a near zero binding energy (for example n = 20) to a 
large negative binding energy (such as n = 1). For the electron to enter a smaller orbit 
with a large negative binding energy it must give up some of its energy. It does this by 
radiating a photon with an energy equal to the difference in energies between the level 
it was at (initial level) and the level it is going to (final level). 




$$E_\gamma = E_i - E_f = \frac{m_e e^4}{8\epsilon_0^2 h^2} \left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right) \tag{14.10}$$

Here $E_\gamma$ is the energy of the photon, $E_i$ the initial energy for a principal quantum number 
$n_i$, and $E_f$ the final energy for a principal quantum number $n_f$. 

You can think of it as the electron having to pay an entrance fee (the photon) to enter 
an exclusive club (a more central radius). The more "inner circle" the club (in this case 
a smaller radius) the more expensive the cost, and so a more energetic photon must be 
given up. Of course in this example, the most exclusive club is the Bohr radius, $r_1$, and 
the cost is 13.6 eV if entering from the outside. 

In the club example, we can tell the difference between a 5 euro note and a 10 euro 
note by color. We can do the exact same with the photons because of Equation 14.7! A 
high energy photon will have a high frequency (for example, blue) while a low energy 
photon will have a low frequency (for example, red). What this means is that when 
we watch a hydrogen atom being bombarded with electrons, we should continually see 
photons being thrown off by the electrons in their attempts to get to a smaller radius. 
In the club analogy, think of a crowd of people jostling to get into the exclusive section 
of the club, waving money in their hands to get the attention of the bouncer. 

When we are in a club, we know the admission prices. Here with the electrons, we 
also know the admission prices, but now the price is given by a frequency or wavelength 
instead of euros. 



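$$E_\gamma = hf \;\Rightarrow\; f = \frac{m_e e^4}{8\epsilon_0^2 h^3} \left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right), \qquad f = \frac{c}{\lambda} \;\Rightarrow\; \frac{1}{\lambda} = \frac{m_e e^4}{8 c \epsilon_0^2 h^3} \left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right) \tag{14.11}$$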



In the first step we have just used Equation 14.7. In the second step we have used 
Equation 14.10 to write $E_\gamma$. In the third step we have just written the normal relation 
between frequency and wavelength, $f = v/\lambda$. 9 Notice that this equation is just Equation 
1 of the lab manual. From this we can solve for the Rydberg constant which is just 
the constant of proportionality in front of this equation. 

$$R = \frac{m_e e^4}{8 c \epsilon_0^2 h^3} \approx 1.1 \times 10^7\ \mathrm{m}^{-1} = 0.011\ \mathrm{nm}^{-1} \tag{14.12}$$

We have just derived Equation 3 in the lab manual! 
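
As a check on Equation 14.12, the Rydberg constant can be computed directly from the fundamental constants. The Python sketch below does this and then uses Equation 14.11 to find the wavelength of a single transition. The constant values, the function names, and the choice of the n = 2 to n = 1 transition are assumptions made for illustration.

```python
PLANCK_H = 6.62607015e-34          # Planck's constant, J s
EPSILON_0 = 8.8541878128e-12       # electric constant, F / m
ELECTRON_MASS = 9.1093837015e-31   # electron mass, kg
ELECTRON_CHARGE = 1.602176634e-19  # fundamental charge, C
LIGHT_C = 2.99792458e8             # speed of light, m / s


def rydberg_constant() -> float:
    """Equation 14.12: the prefactor of Equation 14.11, in inverse metres."""
    return (ELECTRON_MASS * ELECTRON_CHARGE**4
            / (8.0 * LIGHT_C * EPSILON_0**2 * PLANCK_H**3))


def inverse_wavelength(n_initial: int, n_final: int) -> float:
    """Equation 14.11: one over the photon wavelength, in inverse metres."""
    return rydberg_constant() * (1.0 / n_final**2 - 1.0 / n_initial**2)


print(f"R = {rydberg_constant():.4e} 1/m")
print(f"2 -> 1 wavelength = {1.0e9 / inverse_wavelength(2, 1):.1f} nm")
```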

14.3 Transition Series 

Now that we know that electrons within the hydrogen atom will emit photons of a 
specific energy, frequency, and wavelength, as given in Equation 14.11, we can make an 
experimental prediction. We should observe a spectrum of photons with wavelengths 
dictated by their initial quantum number $n_i$ and their final number $n_f$ emitted from 
hydrogen atoms being bombarded with electrons. These spectra, categorized by the 
final quantum number $n_f$, are called transition series; the electron is going through 
a series of transitions from energy level to energy level. Several important transition 
series are named after their discoverers: the Lyman series corresponding to $n_f = 1$, 
the Balmer series corresponding to $n_f = 2$, and the Paschen series corresponding to 
$n_f = 3$. 

9 See the waves lab for more details if this relation is unfamiliar. 

Figure 14.2: The first nine transitions of the Balmer series. The colors approximate the 
color of the emitted light of each transition. 

The most important of these three transition series is the Balmer series, as the 
wavelengths of the emitted photons are within the visible spectrum of light for the 
human eye. As rif = 2 we can write the energy of the emitted photon for an electron 
transitioning from principal quantum number nj using Equation 14.10. 

$$E_\gamma = \frac{m_e e^4}{8\epsilon_0^2 h^2}\left(\frac{1}{2^2} - \frac{1}{n_i^2}\right) \tag{14.13}$$



We can make a diagram of the Balmer transitions, as given in Figure 14.2 with the 
corresponding approximate color of the light emitted. 10 The first seven transitions, 
$n_i = 3$ through $n_i = 9$, have their wavelengths, frequencies, and energies summarized in 
Figure 14.2 as well. 

From Figure 14.2 we see that only the first seven or so transitions of the Balmer 
series should be visible to the naked eye, and that as $n_i$ increases, the colors of the light 
are more and more difficult to distinguish. This means that with minimal diffraction 
equipment it is oftentimes only possible to observe the first three lines, a red line, a 
bluish green line, and a dark blue or purple line. 
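
The wavelengths, frequencies, and energies summarized in Figure 14.2 can be reproduced with a few lines of Python. This is only a sketch: the Rydberg constant and Planck's constant are entered as assumed standard values, the function name is hypothetical, and the results are vacuum wavelengths for an infinitely heavy nucleus.

```python
RYDBERG = 1.0973731568e7   # Rydberg constant in 1/m (assumed standard value)
H_EV = 4.135667696e-15     # Planck's constant in eV s (assumed standard value)
C = 2.99792458e8           # speed of light in m/s


def balmer_line(n_i, n_f=2):
    """Return (wavelength in m, frequency in Hz, energy in eV) from Equation 14.11."""
    inverse_wavelength = RYDBERG * (1.0 / n_f**2 - 1.0 / n_i**2)
    wavelength = 1.0 / inverse_wavelength
    frequency = C / wavelength
    energy = H_EV * frequency
    return wavelength, frequency, energy


# The first seven Balmer transitions, n_i = 3 through n_i = 9
for n_i in range(3, 10):
    wavelength, frequency, energy = balmer_line(n_i)
    print(f"n_i = {n_i}: {wavelength * 1e9:6.1f} nm, {frequency:.3e} Hz, {energy:.2f} eV")
```

The first line comes out near 656 nm (red), and the lines crowd together toward the violet end as $n_i$ grows, which is why the higher transitions are hard to tell apart.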



10 Color is a very subjective principle, as it involves an emitter and an observer. These colors are only 
meant to provide a relative approximation of the color which would be observed by the naked eye 
from the Balmer series. The colors were created using the Spectra program. 

14.4 Diffraction 

But what do we mean by "observe a line"? When electricity is run through hydrogen 
gas the gas begins to emit photons from all the different transitions discussed above. 
The photons all mix together, and the experimenter is confronted with a pink glow from 
the hydrogen lamp. How is it then possible to separate the photons from each other 
so we can observe the various wavelengths? The answer is diffraction, which while 
introduced in Chapter 12, requires a little more explanation. 

When light enters a medium (anything besides a complete vacuum) it encounters 
resistance from that medium that slows the light wave down. The energy of the light 
wave (made up of photons, remember) must stay the same, as energy just can't disappear. 
Using Equation 14.7, we see that if the energy stays the same, the frequency must also 
stay the same. This means that if the velocity of light within the medium decreases 
then the wavelength of the light must also decrease, since $\lambda = v/f$. When the velocity 
of light changes as it crosses into the medium at an angle, the light is refracted, or bent. 
Because the velocity of light in a medium depends slightly on its wavelength, different 
colors are bent by different amounts (in glass, more energetic blue light is bent more 
than less energetic red light), and so the combined colors in white light are 
separated. This is the principle by which the prism works and is described by Snell's 
law, derived in Chapter 11, and given by Equation 11.8 which is given again here. 

$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2} \tag{14.14}$$

Remember $\theta_1$ and $\theta_2$ are the incidence angle, the angle at which the light is incident 
to the normal of the medium, and refraction angle, the angle at which the light is 
refracted or bent. The values $v_1$ and $v_2$ are the velocities of light in medium 1 and 
medium 2. 

The same idea of separating colors is used in diffraction gratings, but now the separation is 
accomplished through interference patterns. A diffraction grating is made up of many small 
parallel lines through which light passes. The various wavelengths interfere with each 
other and create bright bands of colors. This is why when looking at a CD, bright colors 
are observed. The theory behind diffraction gratings can take a fair amount of math 
and waves knowledge to understand, so an in depth presentation is not given here, but 
left to the reader to explore. An excellent book that covers this topic is Vibrations and 
Waves by A. P. French. The important relation to remember for the diffraction grating 
is, 

$$m\lambda = d\sin\theta_m \tag{14.15}$$

where $\lambda$ is the wavelength of the light being observed in meters, $d$ the distance between 
adjacent slits in meters (the inverse of the number of slits per meter), $m$ some integer 
greater than zero, and $\theta_m$ the angle at which the light is observed 
with respect to the normal of the diffraction grating. What this equation tells us is that 
for a specific $\lambda$ we expect to see a bright band of this color at regular intervals in 
$\sin\theta$. Consequently, by observing a color band at some angle $\theta$ we can determine $\lambda$. 

Figure 14.3 plots the intensity of the first three Balmer lines, given in Figure 14.2 as 
they pass through a diffraction grating. The grating used in this plot has 600 lines per 
Figure 14.3: The intensity profile for the first three transitions of the Balmer series with 
a diffraction grating of $d = 0.001/600$ m/lines, $w \sim d$, and $N \approx 1800$. This 
plot was calculated using Equation 14.16. 
millimeter and the grating is 3 cm wide. 11 Every large peak in the plot corresponds to 
a specific value for $m$, $\theta$, and $\lambda$, in Equation 14.15. From the plot we see that if we were 
using this diffraction grating while observing a hydrogen lamp, we would expect to see 
bright bands of color at angles of $\approx$ 15°, 17°, and 24°. All of these bands correspond to 
$m = 1$ in Equation 14.15. 12 
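
The angles read off the plot can be cross-checked against Equation 14.15. The sketch below assumes the 600 lines per millimeter grating described above and computes the Balmer wavelengths from Equation 14.11 with an assumed standard value for the Rydberg constant; the function names are illustrative only.

```python
import math

RYDBERG = 1.0973731568e7   # Rydberg constant in 1/m (assumed standard value)
LINES_PER_MM = 600         # grating density quoted in the text
d = 1e-3 / LINES_PER_MM    # slit spacing in meters


def balmer_wavelength(n_i, n_f=2):
    """Wavelength in meters from Equation 14.11."""
    return 1.0 / (RYDBERG * (1.0 / n_f**2 - 1.0 / n_i**2))


def grating_angle_deg(wavelength, spacing, order=1):
    """Angle in degrees from Equation 14.15, or None if that order cannot appear."""
    sine = order * wavelength / spacing
    if sine > 1.0:
        return None
    return math.degrees(math.asin(sine))


# First-order (m = 1) angles for the first three Balmer lines
angles = {n_i: grating_angle_deg(balmer_wavelength(n_i), d) for n_i in (5, 4, 3)}
for n_i, theta in angles.items():
    print(f"n_i = {n_i}: theta = {theta:.1f} degrees")
```

The computed angles sit within about a degree of the values estimated from Figure 14.3, which is as good as can be expected from reading a plot by eye.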

14.5 Experiment 

The goals of the experiment associated with this chapter are to observe the first three 
electron transitions of the Balmer series, calculate their wavelengths, verify the Bohr 
model for the hydrogen atom, and calculate the Rydberg constant. 

Those are quite a few goals to accomplish, and some seem rather complicated, but the 
actual experimental procedure is not all that time consuming. It is the understanding of 
the theory, explained above that can be tricky. This lab actually only consists of making 
three experimental measurements. 

A hydrogen lamp is placed in front of a telescope/collimating device. All the tele- 
scope/collimator does is create a bright band of light from the hydrogen lamp that we 
can observe. The light travels through the telescope to a diffraction grating, and passes 
through the diffraction grating. On the other side another telescope, attached to a scale 
that reads out angle in degrees, points at the diffraction grating. 

The idea is to look through this telescope at the diffraction grating and hydrogen 
lamp, and observe bands of light similar in color to the first three transitions of Figure 
14.2. The spacing in $\theta$ and intensity of the light band intervals will be similar to that 
of Figure 14.3. Notice that after a very bright band there oftentimes will be a few less 
intense bands of the same, or similar color. Make sure not to record the angle for these 
bands as these are fringe effects. 

For each observed band (three for this experiment) an angle $\theta$ is recorded. This $\theta$ is 
plugged back into Equation 14.15 with $m = 1$ and $\lambda$ is found. The values for $1/\lambda$ versus 
$1/n_i^2$ can then be plotted against each other and a value for the Rydberg constant, given 
in Equation 14.12, determined from the slope of the plot. Additionally, the final energy 
level for the series, $n_f$, can be determined using the Rydberg constant determined from 
the slope of the plot, and the intercept of the plot. 
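
The analysis described above can be sketched in Python as follows. The three angles are illustrative placeholders, not real measurements, and the grating spacing assumes 600 lines per millimeter; replace both with the values from your own apparatus. Since Equation 14.11 gives $1/\lambda = R/n_f^2 - R/n_i^2$, a straight-line fit of $1/\lambda$ against $1/n_i^2$ has slope $-R$ and intercept $R/n_f^2$.

```python
import math

D = 1e-3 / 600   # slit spacing in meters (assumes a 600 lines/mm grating)

# Illustrative first-order angles in degrees keyed by n_i. These are NOT real data.
angles_deg = {3: 23.2, 4: 17.0, 5: 15.1}

# x = 1/n_i^2 and y = 1/lambda, with lambda = d sin(theta) for m = 1 (Equation 14.15)
xs = [1.0 / n_i**2 for n_i in angles_deg]
ys = [1.0 / (D * math.sin(math.radians(theta))) for theta in angles_deg.values()]


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = linear_fit(xs, ys)
R_fit = -slope                            # slope of the plot is -R
n_f_fit = math.sqrt(R_fit / intercept)    # intercept of the plot is R / n_f^2
print(f"R = {R_fit:.3e} 1/m, n_f = {n_f_fit:.2f}")
```

With real data the scatter of the three points about the fitted line also gives a feel for the uncertainty on the Rydberg constant.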



11 These numbers were not chosen at random; they closely model the diffraction gratings that are used 
in this lab. 
12 The equation used to plot Figure 14.3 is a little more complicated than Equation 14.15 as now we are 
looking at the intensity of the light. The equation used here is, 

$$I(\theta) = I_0\,\mathrm{sinc}^2\left(\frac{\pi w}{\lambda}\sin\theta\right)\left[\frac{\sin\left(\frac{N\pi d}{\lambda}\sin\theta\right)}{\sin\left(\frac{\pi d}{\lambda}\sin\theta\right)}\right]^2 \tag{14.16}$$

where $w$ is the width of the slits, $d$ the separation of the slits, $N$ the number of slits, and $I_0$ the 
initial intensity of the light. 
The Bohr model, which was derived from a combination of classical and quantum 
theory in Chapter 14, seems a little too good to be true. That's because it is, as was 
alluded to with the mention of the atomic orbital model in the previous chapter. The 
Bohr model consists of classical theory with the introduction of just a single quantization, 
the principal quantum number, $n$. We know that the model works well for the 
hydrogen atom, but it is by an ironic twist of physics that the Bohr model is right for 
all the wrong reasons. 1 

So why exactly is the Bohr model wrong? The first problem is that the electrons 
within an atom can move at velocities near the speed of light, and so Einstein's theory 
of relativity must be used instead. The second problem is that the type of quantum 
mechanics used is just a rudimentary form of a much larger and intricate theory. Re- 
membering back to Chapter 3, relativity describes objects moving very quickly, and 
quantum mechanics describes very small objects. For the hydrogen atom the electrons 
are both small and fast, so they must be described using relativistic quantum 
theory. 

In experiment, the Bohr model was found to disagree with a variety of results. Perhaps 
one of the more pronounced phenomena is the Zeeman effect, which occurs when an 
atom is placed under a strong magnetic field. When this occurs, the atom begins to emit 
many more wavelengths of light than it should and so we know the atom has more energy 
levels than the Bohr model predicts. 2 Another important phenomenon is hyperfine 
splitting which allows photons to radiate from hydrogen atoms with a wavelength of 
21 cm. This wavelength is within the radio range of the electromagnetic spectrum, and 
has led to incredible advances in radio astronomy. 3 

Another phenomenon that combines many of the ideas above is fluorescence. Not 
only is fluorescence somewhat challenging to spell, it is quite challenging to understand. 
In reality, theory can describe fluorescence up to a point, but there are so many facets 
to the phenomenon that there is no complete theory that can accurately describe the 
fluorescence of any arbitrary atom. So what exactly is fluorescence? Fluorescence is 
when an atom or molecule absorbs a photon, for example a blue photon, and then after 
a short time period releases another photon with less energy, for this example let us say 



1 Yes, it really does mess with physicists' heads when the theory is wrong, but agrees with experiment. 
The real question then is how do you know the theory is wrong? The answer is the theory breaks 
down when the experiment is performed in more detail. 
2 If you would like to read more, look at Introduction to Quantum Mechanics by David Griffiths, or 
check out a lab write-up by Philip Ilten on the Zeeman effect at 
http://severian.mit.edu/philten/physics/zeeman.pdf. 
3 A lab report for observing the galactic plane can be found at 
http://severian.mit.edu/philten/physics/radio.pdf. 
red. 4 

So how exactly does the process of fluorescence occur? The answer is rather complicated, 
and first we need some more theory. To begin, we will look at a simple one-dimensional 
quantum system, the simple harmonic oscillator. This bit of the 
chapter is not necessary to understand fluorescence (nor are the next two sections) so 
don't panic if it seems a little complicated. Next we will apply this theory (loosely) to 
the hydrogen atom, and then see how energy splittings of the Bohr energy levels can 
occur. Finally, we will explore how electrons in molecules can undergo non-radiative 
transitions, which can lead to fluorescence. 

15.1 Simple Harmonic Oscillator 

Very few quantum mechanical systems can be explicitly solved, but one that can is the 
case of simple harmonic oscillation which we explored classically in Chapter 6. If 
you are a little rusty on the classical derivation, go take a look at Chapter 6 briefly, as 
it will help with the following theory (although it is not entirely necessary). The force on 
a simple harmonic oscillator is given by Hooke's law, 

F = -kx (15.1) 

where F is force, k is the spring constant, and x is position. Both x and F are vector 
quantities, but because the problem is one-dimensional, we can indicate direction with 
just a positive (pointing towards the right) or negative (pointing towards the left) sign. 
The work done by a force on an object is the integral of the force over the distance 
which it was exerted or, 

$$W = \int F\,dx \tag{15.2}$$

where W is work. The change in potential energy for an object after a force has been 
applied to it over a certain distance is, 

$$\Delta V(x) = -\int F\,dx \tag{15.3}$$

where V(x) is the potential energy of the object at position x. If we move the harmonic 
oscillator a distance x from equilibrium the potential energy of the object is, 

$$V(x) = -\int_{x_0}^{x} F\,dx = \frac{1}{2}kx^2 = \frac{1}{2}m\omega^2x^2 \tag{15.4}$$

if we assume the object is at equilibrium for $x_0 = 0$. In the final step we have used the 
relation for $k$ in terms of mass of the object, $m$, and the angular frequency, $\omega$, given 
by Equation 6.5. 



4 This definition of fluorescence is very qualitative, but that is only because fluorescence can be somewhat 
difficult to define. There is another type of fluorescence called anti-Stokes fluorescence where the 
emitted photon has a larger energy than the absorbed photon. 
The reason the harmonic oscillator problem is so important is because of the potential 
above. This potential is quadratic in x, which means that if an arbitrary potential, 
V(x), is expanded about a minimum using a Taylor series expansion, the potential can 
be described very well by simple harmonic motion. 5 In other words, simple harmonic 
motion can be used to approximate much more complex systems for small oscillations 
of the system. 

So where does quantum mechanics come into all of this? In quantum mechanics, all 
particles and objects are represented by waves or wave packets. We already discussed 
this briefly before in Chapter 12 with the introduction of the de Broglie wavelength. 
In quantum mechanics, all particles must satisfy Schrödinger's equation 6 , 

$$H\psi = E\psi \tag{15.5}$$

where $\psi$ is the wavefunction for a particle, and $E$ is the energy of the particle. By 
wavefunction what we mean is the equation that describes the shape of the particle in 
space. Oftentimes in quantum mechanics a Gaussian wavefunction is used, similar to the 
shape of Figure 1.2 in Chapter 1. The square of the magnitude of a wavefunction gives us the 
probability density function of finding the particle (think back to Chapter 1). What this 
wavefunction tells us is that the particle is not just in one place, but that it has the 
probability of being in a variety of different places. Until we actually measure where the 
particle is, it could be anywhere described by its wavefunction. 

The term $H$ in Equation 15.5 is called the Hamiltonian and is just the total energy 
of the particle, its potential energy plus its kinetic energy. Mathematically, the Hamiltonian is given 
by 

$$H = V(x) - \frac{\hbar^2}{2m}\nabla^2 = V(x) - \frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} \tag{15.6}$$

where the final step is the Hamiltonian for a one-dimensional system like the simple 
harmonic oscillator. The kinetic term carries a minus sign only because of how the 
momentum is written as an operator, as we will see below. The term $\hbar$ is the reduced 
Planck's constant, and is just $h/2\pi$. 
Mathematically the Hamiltonian is a special object called an operator. An operator 
is a symbol that denotes performing certain steps on whatever comes after the operator. 
Operators are nothing new, as a matter of fact, $\frac{d}{dx}$, which just denotes taking the 
derivative of a function, is the differential operator. What is important to remember 
about operators is that they are not necessarily (and usually are not) commutative. 
The order of operators is important and cannot be switched around. 

We can write the Hamiltonian operator in terms of two new operators, the momentum 
operator, $\hat{p}$, and the position operator, $\hat{x}$. The momentum operator is, 

$$\hat{p} = -i\hbar\frac{\partial}{\partial x} \tag{15.7}$$



5 The Taylor expansion of $V(x)$ about $x_0$ is given by $V(x) = V(x_0) + V'(x_0)(x - x_0) + \frac{1}{2}V''(x_0)(x - x_0)^2 + \cdots$. 
But if $x_0$ is at a minimum, then $V'(x_0) = 0$ and so the first non-constant term in the expansion is 
given by the quadratic term. 
6 This is the time-independent Schrödinger's equation; the time-dependent equation is a bit more 
complicated, but we don't need to worry about that for the example of simple harmonic motion. 
and the position operator is just $\hat{x} = x$. Substituting the momentum operator into the 
Hamiltonian of Equation 15.6 yields, 

$$H = V(x) + \frac{\hat{p}^2}{2m} \tag{15.8}$$

where we have used that $\hat{p}^2 = -\hbar^2\,\partial^2/\partial x^2$. 

Now we can insert the potential energy for simple harmonic oscillation, Equation 15.4, 
into the Hamiltonian above. 

$$H = \frac{1}{2}m\omega^2x^2 + \frac{\hat{p}^2}{2m} = \frac{(m\omega x)^2 + \hat{p}^2}{2m} \tag{15.9}$$

The numerator of this equation looks very similar to a quadratic equation, $ax^2 + bx + c$, 
without the final two terms. This means that we should be able to factor it into two new 
operators. At this point you might ask why on earth would we want to do that? The 
answer is to read on; hopefully this step will make more sense after a few more paragraphs. 
We introduce two new operators, the creation operator, $a$, and annihilation 
operator, $a^\dagger$, which are given by, 

$$a = \frac{-i\hat{p} + m\omega x}{\sqrt{2\hbar m\omega}}, \qquad a^\dagger = \frac{i\hat{p} + m\omega x}{\sqrt{2\hbar m\omega}} \tag{15.10}$$

in terms of $\hat{p}$ and $x$. It turns out that we can factor the Hamiltonian of Equation 15.9 
into, 

$$H = \hbar\omega\left(a^\dagger a - \frac{1}{2}\right) \tag{15.11}$$

where a few algebraic steps have been left out. 7 If we plug this back into Schrödinger's 
equation we get, 

$$\hbar\omega\left(a^\dagger a - \frac{1}{2}\right)\psi = E\psi \tag{15.12}$$

for the simple harmonic oscillator. 
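
For readers who want to see the omitted algebra, here is a sketch of how Equation 15.11 follows from Equation 15.10. It assumes the commutation relation $\hat{p}x - x\hat{p} = -i\hbar$, which is not stated in the text above but can be checked by applying Equation 15.7 to any test function. Notice that the order of the operators is kept throughout.

```latex
\begin{align*}
a^\dagger a &= \frac{1}{2\hbar m\omega}\,(i\hat{p} + m\omega x)(-i\hat{p} + m\omega x) \\
            &= \frac{1}{2\hbar m\omega}\left[\hat{p}^2 + (m\omega x)^2 + i m\omega\,(\hat{p}x - x\hat{p})\right] \\
            &= \frac{1}{2\hbar m\omega}\left[\hat{p}^2 + (m\omega x)^2 + m\omega\hbar\right] \\
            &= \frac{H}{\hbar\omega} + \frac{1}{2}
\end{align*}
```

In the last step the first two terms were recognized as $2mH$ from Equation 15.9. Rearranging gives $H = \hbar\omega\left(a^\dagger a - \frac{1}{2}\right)$.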

Now, hopefully this last step will explain why we went to all the trouble of factoring 
the Hamiltonian. If we operate on the wavefunction for our particle, $\psi$, with the creation 
operator, $a$, we get, 

$$H a\psi = (E + \hbar\omega)\,a\psi \tag{15.13}$$

which means that the wavefunction $a\psi$ also fulfills Schrödinger's equation and has an 
energy of $E + \hbar\omega$! 8 This means that if we find the lowest energy wavefunction for 
Equation 15.12, $\psi_0$, with energy $E_0$, we can find every possible wavefunction that will fulfill 


7 Check the result though if in doubt! Just use Equation 15.10, and see if you can recover 15.11. 
8 Again, quite a few mathematical steps have been left out, but feel free to verify the result. Remember 
that the order of operators is important! 
Equation 15.12 by just operating on the ground wavefunction with the creation operator, 
$a\psi_0$, however many times is necessary. This is why $a$ is called the creation operator; it 
creates the next wavefunction with a larger energy than the previous wavefunction. 

The annihilation operator does the exact opposite; it finds the next wavefunction with 
a lower energy, and if applied to the ground state, yields zero, $a^\dagger\psi_0 = 0$. It turns out 
that the energy of the ground state is $E_0 = \hbar\omega/2$, and so from this we can find 
all the possible energies of a simple harmonic oscillator using Equation 15.13. 

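$$E_1 = (E_0 + \hbar\omega) = \frac{3\hbar\omega}{2}, \quad E_2 = (E_1 + \hbar\omega) = \frac{5\hbar\omega}{2}, \quad \cdots \tag{15.14}$$

We can generalize this into, 

$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega \tag{15.15}$$

where $E_n$ is the energy of the $n^{\text{th}}$ wavefunction. 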

It may have been somewhat difficult to keep track of exactly what was going on with 
all the math above but the general idea is relatively simple. We began with a system, 
simple harmonic oscillation, and found the potential energy for the system. We applied 
Schrödinger's equation to the system, and after a fair bit of mathematical rigmarole, 
determined that the energy of the simple harmonic oscillator must be quantized, and 
is given by Equation 15.15. This is the general idea of quantum mechanics. Find the 
potential and use Schrödinger's equation to determine the energy levels and wavefunctions 
for the system. The method used above is a nice algebraic one, but oftentimes a much 
more mathematically intensive power series method must be used. 
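
For the skeptical reader, the quantized energies of Equation 15.15 can also be checked numerically without any operator algebra. The sketch below is a third approach not used in this chapter: it puts the Hamiltonian of Equation 15.9 on a grid of points, replaces the second derivative with a finite difference, and finds the lowest eigenvalues with NumPy. The grid size and range are arbitrary choices, and units are chosen so that $\hbar = m = \omega = 1$.

```python
import numpy as np

# Units with hbar = m = omega = 1, so energies come out in units of hbar * omega.
N = 1000                            # number of grid points (an arbitrary choice)
x = np.linspace(-10.0, 10.0, N)     # the wavefunctions are negligible at the edges
dx = x[1] - x[0]

# Kinetic term -(1/2) d^2/dx^2 from the three-point finite-difference stencil,
# plus the potential V(x) = x^2 / 2 from Equation 15.4 on the diagonal.
diagonal = 1.0 / dx**2 + 0.5 * x**2
off_diagonal = np.full(N - 1, -0.5 / dx**2)
H = np.diag(diagonal) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)

energies = np.linalg.eigvalsh(H)[:5]    # the five lowest eigenvalues
for n, energy in enumerate(energies):
    print(f"n = {n}: E = {energy:.4f}  (Equation 15.15 gives {n + 0.5})")
```

The numerical energies should agree with $(n + \frac{1}{2})\hbar\omega$ to a few decimal places, with the small differences coming from the finite grid spacing.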



15.2 Hydrogen Orbitals 

So the previous section was rather complicated, but hopefully it gives a general idea of 
how quantum mechanics is approached. As Richard Feynman, a famous physicist, once 
said, "I think I can safely say that nobody understands quantum mechanics." But how 
does all of this apply to fluorescence? Let us again turn to the hydrogen atom, like 
we did with the Bohr model, but now use a fully quantum mechanical theory. Don't 
worry, we aren't going to solve Schrödinger's equation for a three dimensional potential, 
it's available in many quantum mechanics text books. Instead, we are going to look at 
the end result. 

The potential for a hydrogen atom is given by the electric potential, just as it was for 
the Bohr model in Equation 14.8. 

$$V(r) = -\frac{1}{4\pi\epsilon_0}\frac{e^2}{r} \tag{15.16}$$

Using the potential above, it is possible to work out (although certainly not in the scope 
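of this chapter!) that the wavefunctions for the hydrogen atom are given by, 

$$\psi_{n,l,m_l}(r,\theta,\phi) = \sqrt{\left(\frac{2}{n r_1}\right)^3\frac{(n-l-1)!}{2n\left[(n+l)!\right]^3}}\; e^{-r/nr_1}\left(\frac{2r}{nr_1}\right)^l\left[L_{n-l-1}^{2l+1}\left(\frac{2r}{nr_1}\right)\right]Y_l^{m_l}(\theta,\phi) \tag{15.17}$$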



where there are quite a few letters that need explaining. 9 Before that however, it is 
important to realize that it is not necessary to understand Equation 15.17, but rather to 
understand its implications. The pattern of electrons around a hydrogen atom are not 
just simply in spherical orbits like Bohr's model predicted, but rather, are described by 
complicated wavefunctions! 

So, back to understanding the symbols of Equation 15.17. First, $\theta$, $\phi$, and $r$, are just 
the standard variables for a spherical coordinate system. Next, there is no longer just 
one quantum number, $n$, but three! These quantum numbers are $n$, $l$, and $m_l$ and will be 
explained more in the next section. For now, just remember that $0 \le l < n$, $-l \le m_l \le l$, 
and that all three quantum numbers must be integers. There is only one constant in the 
equation above and that is $r_1$ or the Bohr radius, as introduced in Chapter 14. Finally, 
there are the letters $L$ and $Y$ which stand for special functions called the Laguerre 
polynomials and the spherical harmonics (which are built from Legendre functions). In two 
dimensions sine and cosine waves can 
be used to describe any function through what is known as a Fourier decomposition. 
These two types of functions serve a very similar purpose, but now for three dimensions. 

The square of Equation 15.17, as mentioned earlier for wavefunctions in general, gives 
the probability density functions for the wavefunctions of the hydrogen atom. What this 
means is that if we square Equation 15.17 for some n, I, and mi and integrate over a 
volume in the spherical coordinate system, we will know the probability of finding the 
electron within that volume. Visualizing the probability density functions for the hydro- 
gen atom can be quite challenging because we are trying to represent the probability for 
every point in three dimensional space about the hydrogen atom. The problem of visu- 
alizing the probability density functions for the hydrogen atom is very much like trying 
to determine the inside of a fruit cake, without cutting it. So perhaps the simplest (and 
most common) solution to visualizing the probability density functions of the hydrogen 
atom is to just cut the cake and look at a single slice. 

Figure 15.1 does just that; it takes a slice of the probability density functions, $|\psi_{n,l,m_l}|^2$, 
of the hydrogen atom for $n = 1, 2, 3$, all possible $l$, and $m_l = 0$. The results certainly 
are interesting! In Figure 15.1, the center of each plot corresponds to the nucleus of 
the hydrogen atom. Both the x and y-axis give the distance from the nucleus in units 
of Bohr radii, $r_1$. The color of the plot corresponds to the value of the probability 
density function, red corresponds to a high probability, while blue corresponds to a low 
probability. 



9 This form is taken from Introduction to Quantum Mechanics by David Griffiths. 
Figure 15.1: Profiles of the probability density clouds describing the position of electrons 
in a hydrogen atom, given the quantum numbers $n$, $l$, and $m_l$. 
For the lowest energy level of the hydrogen atom, $\psi_{1,0,0}$, we see that the electron will 
be within a sphere of 5 Bohr radii from the center of the atom with a nearly 100% 
probability. This actually is not very different from the Bohr model, but now instead 
of a sharply defined orbit, the electron can reside anywhere within the blue cloud of 
Figure 15.1(a). The next highest energy level of the hydrogen atom, $\psi_{2,0,0}$, also exhibits 
a similar behavior, but now it is more probable for the electron to be found farther from 
the nucleus of the atom. Thinking again in terms of the Bohr model, this makes sense, 
as we would expect higher energy levels to have larger radii for their orbits. If we now 
look at the probability density function for $\psi_{2,1,0}$, Figure 15.1(c), we see a shape that is 
not even close to looking spherical, a rather radical departure from the Bohr model. The 
same applies to the probability density functions for $\psi_{3,1,0}$ and $\psi_{3,2,0}$; neither of these 
are remotely spherically shaped. 

Up until this point we have not discussed what the energy for a given wavefunction 
$\psi_{n,l,m_l}$ is. Looking at Figures 15.1(d), 15.1(e), and 15.1(f), we would certainly expect that the energy 
levels of these three wavefunctions should be completely different, as their probability 
density functions certainly are. This however, is not the case, and the energy level for 
a wavefunction given by Equation 15.17 is only dependent upon the principal quantum 
number $n$, 

$$E_n = -\frac{13.6~\text{eV}}{n^2} \tag{15.18}$$

and by a strange twist of fate is exactly the same as what we derived earlier using 
the Bohr model in Equation 14.9. This yields a rather interesting result, the electrons 
described by Figure 15.1(a) have an energy of —13.6 eV, Figures 15.1(b) and 15.1(c) 
both have an energy of —3.4 eV, and Figures 15.1(d), 15.1(e), and 15.1(f) all have the 
same energy of —1.5 eV despite looking completely different. 

There is one final note to make about the hydrogen atom, and that is that oftentimes 
in chemistry and certain types of physics, the probability density functions of the hy- 
drogen wavefunctions are called electron orbitals. This term is somewhat deceptive; 
the electrons do not actually orbit, but rather the probability of finding an electron 
is described by the orbital. A type of classification called spectroscopic notation is 
oftentimes used to describe the different types of orbitals, and is useful to know. The 
notation is given by $nl$ where $n$ is just the principal quantum number and $l$ is also the 
quantum number from previously, but now is assigned a letter instead of a number: 
$l = 0 \to s$, $l = 1 \to p$, $l = 2 \to d$, $l = 3 \to f$. 10 For $l$ greater than 3, the letters are 
assigned alphabetically. The first four letters are assigned because they stand for sharp, 
principal, diffuse, and fundamental, which apparently describe the type of spectroscopic 
line given by each orbital. Visually, s orbitals are described by a spherical shape, p orbitals 
by a barbell shape, d by a barbell shape with a ring around it, and f by a barbell 
with a double ring (not shown in Figure 15.1). As an example of spectroscopic notation, 
Figure 15.1(a) is a 1s orbital. 
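
To make the bookkeeping concrete, the sketch below lists the orbitals available for $n = 1, 2, 3$ in spectroscopic notation, together with the energy from Equation 15.18. The function names are hypothetical, and only the four letters named above are handled.

```python
ORBITAL_LETTERS = "spdf"   # l = 0, 1, 2, 3


def orbital_label(n, l):
    """Spectroscopic label such as '1s' or '3d' for quantum numbers n and l."""
    if n < 1 or not 0 <= l < n:
        raise ValueError("need n >= 1 and 0 <= l < n")
    if l >= len(ORBITAL_LETTERS):
        raise ValueError("only l = 0 through 3 are handled in this sketch")
    return f"{n}{ORBITAL_LETTERS[l]}"


def bohr_energy_ev(n):
    """Energy in eV from Equation 15.18; it depends only on n."""
    return -13.6 / n**2


for n in (1, 2, 3):
    labels = [orbital_label(n, l) for l in range(n)]
    print(f"n = {n}: E = {bohr_energy_ev(n):6.2f} eV, orbitals: {', '.join(labels)}")
```

Every orbital printed on the same line shares the same energy, which is exactly the surprising result described above for the panels of Figure 15.1.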



10 The quantum number $m_l$ can also be denoted as a subscript to the letter representing $l$, but the naming 
convention for this is rather complicated and is generally not used. 
15.3 Energy Splitting and Quantum Numbers 

In the previous section we needed 3 quantum numbers to describe the electron wavefunctions 
for the hydrogen atom, $n$, $l$, and $m_l$. But what exactly are these numbers? The 
quantum number $n$ was first introduced in Chapter 14 and is the principal quantum 
number which describes the energy level of the wavefunction. The 
quantum numbers $l$ and $m_l$ require a bit more explanation and the introduction of two 
new quantum numbers $s$ and $m_s$. 

The quantum numbers $l$, $m_l$, $s$, and $m_s$ all help describe the angular momentum of a particle, 
in the case of the hydrogen atom, the electron orbiting the nucleus. Going back to the 
planetary model of the Bohr model we can think of the electron as the earth and the 
nucleus as the sun. When the earth orbits the sun it has two types of angular momentum: 
angular momentum from orbiting the sun, $L$, and angular momentum from rotating 
on its axis, $S$. Similarly, electrons have orbital angular momentum, $L$, and spin angular 
momentum, denoted by the letter $S$. We define the spin and orbital angular momentum 
in terms of, 

$$S = \hbar\sqrt{s(s+1)} \tag{15.19a}$$

$$L = \hbar\sqrt{l(l+1)} \tag{15.19b}$$

where $s$ is called the spin quantum number and $l$ is called the orbital quantum 
number. 11 Whenever spin is used in reference to an electron or particle, we are not 
referring to the spin angular momentum of the particle, but rather the spin quantum 
number. 

However, there is a bit of a problem; electrons are considered to be point-like objects, 
and subsequently the idea of spin momenta and angular momenta doesn't quite work 
the same way as it does for the earth. Specifically, the spin quantum number of an 
electron is $\frac{1}{2}$. Any type of particle which carries a half-integer spin, like the electron, is 
called a fermion, while any particle which carries an integer spin, such as the photon 
with spin 1, is called a boson. A particle with spin $s$ can have a spin projection 
quantum number, $m_s$, from $-s$ to $s$ in integer steps. For example, the electron can 
have a spin projection quantum number of $+\frac{1}{2}$ (spin up) or $-\frac{1}{2}$ (spin down). The photon 
is a special case: because it is massless, it can only have a spin projection quantum 
number of $-1$ or $+1$, and never 0. The same principle applies for 
$l$ and $m_l$; $m_l$, the orbital projection quantum number, can range from $-l$ to $l$ in 
integer steps. 12 The reason both $m_s$ and $m_l$ are called projection numbers is because 
they represent the projection of the orbital or spin momentum of Equation 15.19 onto 
an arbitrary axis of the particle. 
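
The rule "from $-s$ to $s$ in integer steps" is easy to tabulate. The sketch below uses Python's `fractions` module so that half-integer values stay exact; the function names are illustrative. Note that the rule by itself does not know about the special case of massless particles, so it is only applied to the electron spin and to orbital quantum numbers here.

```python
import math
from fractions import Fraction


def projections(j):
    """All projection quantum numbers from -j to +j in integer steps."""
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a non-negative integer or half-integer")
    return [-j + k for k in range(int(2 * j) + 1)]


def magnitude_in_hbar(j):
    """Magnitude sqrt(j (j + 1)) of Equation 15.19, in units of hbar."""
    j = Fraction(j)
    return math.sqrt(j * (j + 1))


print("electron m_s:", [str(m) for m in projections(Fraction(1, 2))])
for l in range(3):
    print(f"l = {l}: m_l =", [str(m) for m in projections(l)],
          f"|L| = {magnitude_in_hbar(l):.3f} hbar")
```

A quantum number $j$ always has $2j + 1$ projections, so an electron has two spin states and an orbital with $l = 2$ has five values of $m_l$.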

So why on earth do we need all these quantum numbers?! We just need $n$ to describe 
the energy levels of the hydrogen atom, $l$ helps describe the shape of the electron 
wavefunctions, but $m_l$, $s$, and $m_s$ all seem to be overkill. This is because unfortunately the 



11 Oftentimes $l$ is called the azimuthal quantum number. 
12 This quantum number is also called the magnetic quantum number because it describes the 
magnetic interaction between the electron and proton of a hydrogen atom. 
potential energy for an electron in a hydrogen atom is not given by just Equation 15.16. 
As a matter of fact quite a few other factors that are much less significant than the 
Coulomb force must be taken into account, which require the use of all the additional 
quantum numbers introduced above. 

The first problem with Equation 15.16 is that it does not take into account the relativistic 
motion of the electron or the effect of the proton's magnetic field on the electron 
(this is called spin-orbit coupling). Taking these two effects into consideration splits 
the Bohr energy levels of Equation 15.18 into smaller energy levels using $m_l$ and $m_s$. 
This splitting of the energy levels is called the fine structure of the hydrogen atom. 
Additionally, the electric field that the electron experiences from the proton must be 
quantized, and this yields yet another energy splitting called the Lamb shift. Finally, 
the nucleus of the hydrogen atom interacts with the electric and magnetic fields of the 
electrons orbiting it, and so a final correction called hyperfine splitting must be made. 

This certainly seems like quite a few corrections to Equation 15.16, and that is because 
it is. Sometimes it is easier to think of all the corrections to the energies in terms of 
energy level splitting. The difference between Bohr energies is $\sim 10$ eV, while the 
difference in fine splitting is $\sim 10^{-4}$ eV, Lamb shift $\sim 10^{-6}$ eV, and hyperfine splitting 
$\sim 10^{-6}$ eV. Really it is not important to remember this at all. What is important to 
remember is that the already rather complicated pattern of electron orbitals shown in 
Figure 15.1 is even more complicated. The hydrogen atom, the simplest atom we can 
look at, is not simple at all! 
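
These orders of magnitude can be tied back to the 21 cm hyperfine line mentioned at the start of this chapter. The sketch below converts a wavelength to a photon energy with $E = hc/\lambda$, using assumed standard values for the constants, and compares the 21 cm photon with the red Balmer line.

```python
H_EV = 4.135667696e-15   # Planck's constant in eV s (assumed standard value)
C = 2.99792458e8         # speed of light in m/s


def photon_energy_ev(wavelength_m):
    """Photon energy in eV for a wavelength in meters, E = h c / lambda."""
    return H_EV * C / wavelength_m


hyperfine_ev = photon_energy_ev(0.21)         # the 21 cm hyperfine line
balmer_red_ev = photon_energy_ev(656e-9)      # the red Balmer line, roughly 656 nm
print(f"21 cm photon:  {hyperfine_ev:.2e} eV")
print(f"656 nm photon: {balmer_red_ev:.2f} eV")
print(f"ratio: {balmer_red_ev / hyperfine_ev:.1e}")
```

The 21 cm photon carries a few millionths of an electron volt, in line with the $\sim 10^{-6}$ eV quoted for hyperfine splitting, while a visible Balmer photon carries a few hundred thousand times more.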

15.4 Non-Radiative Transitions 

As we learned in Chapter 14, whenever an electron makes a transition from one Bohr 
energy level to another, a photon is emitted with frequency, 

$$E_\gamma = E_i - E_f \quad\Rightarrow\quad f = \frac{E_i - E_f}{h} \qquad (15.20)$$

where Equations 14.10 and 14.7 were combined from Chapter 14. Sometimes, however, 
electrons can transition from energy state to energy state without radiating a photon. 
This is called a non-radiative transition and can occur through a variety of mechanisms.

The first mechanism by which a non-radiative transition can occur in an atom or 
molecule is through what is known as internal conversion. In all of the theory above, 
it has always been assumed that the nucleus of the hydrogen atom is stationary and not 
moving. In reality this is not the case (unless the temperature was absolute zero) and 
so another layer of complexity must be added to the already complex energy structure 
of the hydrogen atom. When an atom vibrates it can do so at quantized energy levels 
known as vibrational modes. These energy levels are layered on top of the already 
existing energy levels of the hydrogen atom. When internal conversion occurs an electron 
releases its energy to the atom through the form of a vibration rather than a photon. 
Usually internal conversion occurs within the same Bohr energy level (same $n$) because
of what is known as the Franck-Condon principle. This principle states that a switch
between vibrational modes is more likely when the initial wavefunction of the electron 
closely matches the final wavefunction of the electron, which usually occurs within the 
same Bohr energy level. 

Another possibility for non-radiative transitions is through vibrational relaxation 
which can only occur in a group of atoms or molecules. This is because the entire 
sample of molecules has vibrational energy levels (essentially the temperature of the 
sample). When an atom or molecule undergoes vibrational relaxation it releases some
energy through the decay of an electron to a lower energy orbital. This energy is absorbed
by the sample, and so the sample transitions to a new vibrational energy level.

Finally, another non-radiative transition process can occur through intersystem
crossing. In an atom or molecule, the total spin for a group of electrons can be
determined by adding spin quantum numbers, $s$, if their spins are pointing in the same
direction, or subtracting spin quantum numbers if their spins are pointing in opposite
directions. Electrons try to pair off into groups of two electrons, with spins pointing
in opposite directions, by what is known as the Pauli exclusion principle, and so if
there is an even number of electrons, the total spin quantum number is usually zero.
This means that $m_s = 0$, and since there is only one possible value for $m_s$, this state
is called a singlet. If instead two of the electrons are unpaired, with their spins pointing
in the same direction, then $s = 1$ and so now $m_s = -1, 0, 1$, which is called a triplet.

Oftentimes an atom will transition from a singlet state to a triplet state (or vice versa),
where a small amount of energy is expended on flipping the spin of an electron. This type
of non-radiative transition is called an intersystem crossing because the atom transitions
from a singlet system to a triplet system. It only occurs in phosphorescence or
delayed fluorescence. In the case of phosphorescence, the electron transitions from a
ground singlet state to an excited singlet state through absorption of a photon. Then it
decays through non-radiative transitions to the triplet state. Finally, it decays back to
the ground singlet state through a radiative decay. For delayed fluorescence, the same
process occurs, but the electron transitions back to the excited singlet state from the
triplet state before emitting a photon. Both of these processes occur over a longer time
period because transitioning from a triplet to a singlet does not occur as rapidly as
internal conversion or vibrational relaxation.

Combining all of these processes with the complicated energy levels of the hydrogen
atom (and even more complicated energy levels for molecules) makes it very difficult
to predict the exact process through which fluorescence occurs. Oftentimes it is
advantageous to visualize all the processes occurring in fluorescence with what is known as a
Jablonski diagram. In the example diagram of Figure 15.2 the electron begins in the
lowest Bohr energy level, $n = 1$.^13 Within this energy level are three vibrational modes
of the atom, $v_1$, $v_2$, and $v_3$, and the electron is at the lowest, $v_1$. Of course, the actual
number of vibrational modes is entirely dependent upon the fluorescing substance. The



13 The notation for fluorescence states is completely different from both the hydrogen wavefunction
notation $\psi_{n,l,m_l}$ and spectroscopic notation. This is because it is important to differentiate
between singlet and triplet states of the atom or molecule when dealing with fluorescence. For this
example however, we will stick with our standard notation from before.

Figure 15.2: A Jablonski diagram for an example fluorescence process.



electron absorbs a photon (the blue squiggly incoming line) with energy $E_a$ which boosts
the electron into the next Bohr energy level, $n = 2$, and into the fourth vibrational mode,
$v_4$, of this energy level. This excitation of the electron is denoted by the solid arrow and
occurs over a time scale of $10^{-15}$ s. Next, the electron non-radiatively transitions from
$v_4$ to $v_1$ of the energy level $n = 2$. This is given by the dotted line and occurs over a
time period of $10^{-12}$ s. Finally the electron transitions to $v_3$ of the $n = 1$ energy level,
the solid black arrow, over a time period of $10^{-9}$ s and emits a photon with energy $E_e$,
the squiggly red line, where $E_e < E_a$.



15.5 Experiment 

In Figure 15.2 the absorption of the photon and non-radiative decay of the electron
both occur rapidly in comparison to the final decay of the electron. This is because after 
the electron decays non-radiatively, it is in a metastable state, or a state that has a 
longer lifetime than the previous two processes. The electron while in the metastable 
state cannot decay non-radiatively, as it is in the lowest vibrational mode of the energy 
level, and so it must decay radiatively to the next Bohr energy level. The number of 
electrons within the metastable state is dependent upon the rate at which the electrons 
are able to radiatively decay. The change in number of electrons is, 



$$dN = -\lambda N \, dt \qquad (15.21)$$



where $N$ is the number of electrons at a given time $t$, and $\lambda$ is the decay rate, or fraction
of electrons that will decay per unit time. More details on how to solve this type of differential
equation are given in Chapter 16 for the absorption of β-rays.

The general idea however is to integrate both sides and set initial conditions, which 
results in, 



$$N = N_0 e^{-\lambda t} \qquad (15.22)$$



where again $\lambda$ is the decay rate and $N_0$ is the initial number of electrons in the
metastable state. The mean lifetime of a particle within the metastable state is then
given by,

$$\tau = \frac{1}{\lambda} \qquad (15.23)$$

and is the expected time an electron would stay within the metastable state. In Figure
15.2 the mean lifetime of the electron in the metastable state is $10^{-9}$ seconds, and so
$\lambda \approx 10^{9}$ s$^{-1}$.

Because the number of electrons decaying is directly proportional to the number of 
photons being emitted, the intensity is directly proportional to Equation 15.22. In the 
experiment for this chapter, a strobe light flashes on two different types of fluorescent 
crystals and excites electrons within the crystals rapidly. The electrons then decay over 
a short time period into a metastable state through non-radiative decays. Then, over a 
longer time period (although short to the human eye) these electrons decay radiatively. 

A photo-diode is placed in front of the fluorescing crystals and so after the strobe goes 
off, it is able to determine the intensity of the light being emitted from the crystals. 
This photo-diode is hooked into an oscilloscope where a plot of intensity on the y-axis 
is made against time on the x-axis. By taking data points from this curve, it is possible
to determine a value for $\lambda$ and subsequently for $\tau$, the mean lifetime of the electrons in
the crystal. Remember that Equation 15.22 is exponential, so the trick of making the
equation linear, outlined in Chapter 10, can be applied for better results.
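
As a rough illustration of the linearization trick, taking the natural logarithm of Equation 15.22 gives $\ln N = \ln N_0 - \lambda t$, which is a straight line in $t$ with slope $-\lambda$. The sketch below fits that line by ordinary least squares. The function name `fit_decay_rate`, the sample times, and the decay rate of 250 per second are all made up for the example and are not values from the experiment.

```python
import math

def fit_decay_rate(times, intensities):
    """Fit a straight line to (t, ln I) by least squares.

    Returns (lam, ln_i0), where the slope of the line is -lam.
    """
    ys = [math.log(i) for i in intensities]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, intercept

# Hypothetical, noise-free readings generated from I = I0 * exp(-lam * t).
true_lam = 250.0                                  # 1/s, invented for this sketch
times = [0.001 * k for k in range(1, 9)]          # seconds
intensities = [3.0 * math.exp(-true_lam * t) for t in times]

lam, ln_i0 = fit_decay_rate(times, intensities)
tau = 1.0 / lam                                   # Equation 15.23
print(f"lambda = {lam:.1f} 1/s, tau = {tau:.4f} s")
```

With real oscilloscope data the points will scatter about the line, so the fitted slope carries an uncertainty that should be estimated as well.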






16 Beta Radiation 



Take a scrawny nerd, add a dash of a scientific experiment, and bombard the mixture
with a large amount of radiation; this is the recipe for a superhero.^1 Thanks to popular
culture, radiation has a haze of misinformation surrounding it, so let us strip away
any preconceived ideas and try to start back at the beginning with the definition of
radiation. Radiation is literally the emission of rays, but this is not a very scientifically
precise definition, and so let us instead define radiation as the transfer of energy through
a medium or vacuum by means of waves or subatomic particles.^2 By this definition sound
is radiation, as are electromagnetic waves, and so we are constantly being bombarded
by radiation of many different types, yet not very many of us are turning into superheroes.
That is because the word radiation is oftentimes used to indicate a more specific type of
radiation, ionizing radiation, where the wave or subatomic particle has enough energy
to ionize, or strip the electrons away from atoms. It is this type of radiation that can
be harmful to humans, and has captured the imagination of the public.

There are three types of ionizing radiation: α-radiation, β-radiation, and γ-radiation.
An α-particle is a helium nucleus consisting of two neutrons and two protons, and is liberated
from excited heavy nuclei. The β-ray is an electron emitted from an excited neutron
transitioning to a proton, while the γ-ray is just a high energy photon, or light wave.
Because α-rays are so much larger than the other two types of radiation, they cannot
penetrate objects as easily and can be stopped by a piece of paper. On the other hand,
β-rays can be blocked by a sheet of metal, and γ-rays such as x-rays can require a few
centimeters of lead.^3 This does not, however, indicate the danger of each type of
radiation to the human body. For example, α-particles can cause severe skin damage, while
β-rays from a mild source cause no significant tissue damage.

16.1 Classical Beta Radiation 

The process of β-radiation has been known for over 100 years, yet the implications of
the process are still being debated at the cutting edge of particle physics to this day.
Let us first consider the process of β-radiation where a neutron turns into a proton and


1 The two most famous being Spiderman and the Incredible Hulk, and possibly Captain America,
although his origins are somewhat up to debate due to comic book censorship.

2 It is important to make the distinction of subatomic particles, otherwise we could consider rain to be
a form of water radiation.

3 Think of this as the particles trying to pass through some barrier with holes in it. Helium nuclei are
huge in comparison to electrons, and so it is difficult for them to pass. Electrons are much smaller,
and photons are even smaller still (although technically electrons are considered to be point-like, and
subsequently are as small as you can get).

an electron.

$$n \to p^+ + e^- \qquad (16.1)$$

In particle physics, mass is not conserved (although energy and momentum are), but a
lone particle cannot decay into particles with more total mass. Here, the neutron has a mass
of 939.6 MeV/$c^2$,^4 while the proton has a mass of 938.3 MeV/$c^2$ and the electron has a
mass of 0.5 MeV/$c^2$, so the decay of the neutron is allowed.

If we take Equation 16.1 and impose conservation of energy and momentum, and use
Einstein's famous relation between energy, momentum, and mass, we can find the energy
of the electron from the decay. Einstein's relation states,

$$m^2 c^4 = E^2 - p^2 c^2 \qquad (16.2)$$

or that the mass squared of an object is equal to the energy squared of the object less
the momentum squared. If we let the momentum of the object equal zero, we obtain the
famous equation $E = mc^2$. Returning to the problem at hand, let us consider the decay
of a neutron at rest so $E_n = m_n c^2$ and $p_n = 0$. Now, when the neutron decays, we know
that the electron will have a momentum of $p_e$ and the proton will have a momentum of
$p_p$, but by conservation of momentum $p_e + p_p = p_n = 0$ and so $p_e = -p_p$. Additionally,
by conservation of energy, we can write,

$$E_n = E_p + E_e \qquad (16.3)$$

where $E_p$ is the energy of the proton and $E_e$ the energy of the electron.

Using Einstein's relation of Equation 16.2 again, we can write out the energies of the
proton and electron in terms of momentum and mass.

$$E_p^2 = m_p^2 c^4 + p_p^2 c^2, \qquad E_e^2 = m_e^2 c^4 + p_e^2 c^2 \qquad (16.4)$$

Substituting the proton energy into Equation 16.3 along with the neutron energy, $m_n c^2$,
yields a relation between the momenta, particle masses, and energy of the electron.

$$m_n c^2 = \sqrt{m_p^2 c^4 + p_p^2 c^2} + E_e \qquad (16.5)$$

Next, we solve for the proton momentum in terms of the electron energy and mass by
conservation of momentum,

$$p_p^2 c^2 = p_e^2 c^2 = E_e^2 - m_e^2 c^4 \qquad (16.6)$$

and plug this back into Equation 16.5 to find a relation with only the masses of the
neutron, proton, and electron, and the energy of the electron.

$$m_n c^2 = \sqrt{m_p^2 c^4 - m_e^2 c^4 + E_e^2} + E_e \qquad (16.7)$$



4 In particle physics we give mass in mega-electron volts over the speed of light squared, MeV/$c^2$, but
oftentimes drop the $c^2$ for convenience. One MeV/$c^2$ is equal to $1.8 \times 10^{-30}$ kilograms.

Finally, we solve for the energy of the electron.



$$\begin{aligned}
m_n^2 c^4 + E_e^2 - 2 E_e m_n c^2 &= m_p^2 c^4 - m_e^2 c^4 + E_e^2 \\
E_e &= \frac{m_n^2 c^4 + m_e^2 c^4 - m_p^2 c^4}{2 m_n c^2} \\
&= c^2 \left( \frac{m_n^2 + m_e^2 - m_p^2}{2 m_n} \right) \\
E_e &\approx 1.30 \text{ MeV}
\end{aligned} \qquad (16.8)$$



Plugging in the values for the masses of the neutron, proton, and electron yields the last
line of Equation 16.8 and provides a very experimentally verifiable result. The electrons
produced from the decay of a neutron at rest should have an energy of 1.30 MeV or
$2.1 \times 10^{-13}$ J. The only problem with this theoretical result is that when the experiment
is performed the results do not match our theoretical prediction at all! As a matter of
fact, the electrons observed from β-decay have a wide range of energies ranging from
near 0.5 MeV up to 1.30 MeV. So what is the problem with our theory?
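
The arithmetic behind the last line of Equation 16.8 is easy to check numerically. The short sketch below uses the rounded masses quoted in the text, sets $c = 1$ so that masses and energies are both in MeV, and also confirms that the resulting electron and proton energies add up to the neutron rest energy. The function name and the conversion factor of $1.602 \times 10^{-13}$ J per MeV are choices made for this example.

```python
import math

# Masses in MeV/c^2, rounded as in the text.
M_NEUTRON = 939.6
M_PROTON = 938.3
M_ELECTRON = 0.5
MEV_TO_JOULE = 1.602e-13  # approximate conversion factor

def two_body_electron_energy(m_parent, m_recoil, m_electron):
    """Electron energy in MeV for parent -> recoil + electron, parent at rest (c = 1)."""
    return (m_parent**2 + m_electron**2 - m_recoil**2) / (2.0 * m_parent)

e_electron = two_body_electron_energy(M_NEUTRON, M_PROTON, M_ELECTRON)

# Cross-check with energy conservation: both particles share the same |p|.
momentum = math.sqrt(e_electron**2 - M_ELECTRON**2)
e_proton = math.sqrt(M_PROTON**2 + momentum**2)

print(f"E_e = {e_electron:.2f} MeV = {e_electron * MEV_TO_JOULE:.1e} J")
print(f"E_e + E_p = {e_electron + e_proton:.1f} MeV")
```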

When this problem with β-decay was first discovered in the 1930s, some physicists
wanted to throw out conservation of energy, just as conservation of mass had been thrown
out by Einstein. However, Wolfgang Pauli and Enrico Fermi proposed a simple and brilliant
alternative. The decay of Equation 16.1 is wrong, and there should be an additional
neutral particle on the right hand side. Additionally, since electrons had been observed
up to the energy of Equation 16.8, the particle must be extremely light. Considering a
three-body decay instead of a two-body decay completely changes the theory, and
provides a prediction where the energy of the electron is not fixed, but dependent upon
how much momentum is imparted to the other two particles of the decay.

16.2 The Standard Model 

So what is this missing particle? There is a long and interesting history behind its 
discovery, but we will skip the history lesson and move onto what is currently known. The 
missing particle is an anti-electron neutrino, is very light, has no charge, and interacts 
only through the weak force. The statement above requires a bit of explanation, so let 
us begin with what is known as the Standard Model. The Standard Model is a theory 
developed over the past century which describes the most fundamental interactions (that 
we know of) between particles. The theory itself can be rather complicated and consists 
of local time dependent quantum field theory. Luckily, understanding the results of the 
Standard Model does not require any idea as to what the last sentence meant. 

There are three forces described by the Standard Model, the electromagnetic force, 
the weak force, and the strong force. These forces are mediated or carried out by 
bosons. The electromagnetic force is mediated through the photon, while the weak 
force is mediated through the weak bosons and the strong force is mediated through

Figure 16.1: A summary of the fundamental particles and forces comprising the Standard
Model. Each box contains the particle name, the particle symbol, and the
particle mass, excluding the neutrinos for which the masses are unknown.
Mass data was taken from the Particle Data Group.



the gluon. More generally, bosons are particles with an integer spin,^5 but these bosons
are special because they carry the fundamental forces and are themselves fundamental
particles.^6 The list of forces above however is missing a critical component, gravity! This
is because gravity is much weaker than the three forces above at the particle physics scale,
and so we don't yet really know how it works. Incorporating gravity into the Standard
Model is still an open question.

The right side of Figure 16.1 summarizes the three forces and their boson mediators
as described above. On the left of Figure 16.1 are listed the 12 fundamental fermions,
or particles with half-integer spin, of the Standard Model. The fermions are further
split into two groups, quarks and leptons. The quarks interact through the strong,
weak, and electromagnetic forces, while the leptons only interact through the weak and
electromagnetic forces. The quarks are divided into three generations by mass (and by
discovery) and have either charge $+2/3$ or $-1/3$. Quarks are bound together into groups
called hadrons by the strong force. There are two types of hadrons, baryons consisting
of three quarks, and mesons consisting of a quark and an anti-quark. The proton and neutron
are both hadrons and baryons, where the proton is made up of two up quarks and one down quark



5 See Chapters 14 and 17 for more details. 

6 For example, the Cooper pairs of the BCS theory in Chapter 17 are also bosons, but are neither
fundamental bosons nor carriers of a fundamental force.

and the neutron is made up of one up quark and two down quarks. The LHC is a Large
Hadron Collider because it collides protons with protons.

The leptons do not interact through the strong force like the quarks, and subsequently
do not bind together like the quarks. Individual quarks are never found in nature, while
individual leptons, such as the electron, can be easily found alone. The leptons can also be
broken down into three generations by mass with each generation containing a charged
lepton and a neutral neutrino. The properties of the charged leptons, electrons, muons,
and tau leptons, are well measured while the properties of the neutrinos are not. This
is because the charged leptons interact through the electromagnetic force, and so they
are easy to detect. The neutrinos, however, only interact through the weak force, and
so they are very difficult to detect. Nearly 50 trillion neutrinos pass through the human
body every second, but because they interact so weakly with matter, they have no effect.
Experiments to detect neutrinos require huge arrays of photo-detectors buried deep
underground to filter away extraneous particle noise. This is why β-decay is so important
to particle physics, even today, as it helps provide insights into the fundamental nature
of neutrinos.

That covers all the basics of the Standard Model, and while it is a lot to remember, 
the important things are the three forces, the difference between fermions and bosons, 
and the two types of fundamental fermions. Remembering all the masses and particle 
names can be useful, but is not necessary to understand the basic concepts behind the 
Standard Model. There is one more important detail to mention, and that is that for
every particle there is an anti-particle with the opposite charge. These anti-particles are
normally designated by a bar over the symbol of the particle, except for the charged
leptons. For example an anti-up quark has a charge of $-2/3$ and is denoted by the
symbol $\bar{u}$. The anti-electron, more commonly referred to as a positron, has a charge of
$+1$ and is given by the symbol $e^+$.

16.3 Feynman Diagrams 

Now that we have the basics of the Standard Model, we can take another look at the
β-decay process of Equation 16.1 and include an anti-electron neutrino.

$$n \to p^+ + e^- + \bar{\nu}_e \qquad (16.9)$$

If we try to perform the same momentum and energy analysis as we did above we would 
now have three unknowns, the momenta of the three particles, and only two equations, 
one relating the three momenta, and the other equating the energies. From this we can 
see that it is impossible to determine a unique solution. Clearly a new method is needed 
to approach this problem. 

One of the main features of the Standard Model is that the probability of a particle 
being produced or decaying can be calculated, given the necessary physical constants. 
The method for performing these calculations can be very tedious, consisting of
performing multiple integrals over various phase spaces and employing a variety of mathematical
tricks. However, the physicist Richard Feynman developed a very beautiful method to

Figure 16.2: A possible Feynman diagram for electron scattering is given in Figure
16.2(a), while a diagram for β-decay is given in Figure 16.2(b).



represent these calculations with diagrams which allow the reader to understand the 
underlying physics without needing to know the math. These graphical representations 
of fundamental particle interactions are known as Feynman diagrams. 

Take Figure 16.2(a) as an example of a Feynman diagram where two electrons pass
near each other and are repelled. A Feynman diagram consists of three parts: incoming
particles, some internal structure, and outgoing particles. In Feynman diagrams,
fermions are drawn as solid lines with arrows pointing in the direction of the particle,
weak bosons are indicated by dashed lines, photons are indicated by wavy lines, and
gluons are indicated by loopy lines.^7 Time flows from left to right in Figure 16.2(a)
and so we see there are two incoming electrons, and two outgoing electrons, with a photon
exchanged between the two in the middle of the diagram, driving the two electrons
apart. It is important to understand that Feynman diagrams do not indicate the actual
trajectory of the particles, and have no correspondence to actual physical position; they
only indicate the state of the particles at a point in time.

A vertex is wherever three or more lines connect in a Feynman diagram. There are
twelve fundamental vertices that can be drawn in Feynman diagrams using the fermions
and bosons of Figure 16.1. Six of these vertices consist of only photons and the weak
bosons and are not of much interest because they are relatively unlikely to occur. The
six remaining vertices are drawn in the Feynman diagrams of Figure 16.3. If a Feynman
diagram is drawn with a vertex that is not given in Figure 16.3, it cannot happen! Of
course, there are many other rules which dictate what can and cannot be drawn with
Feynman diagrams, but this is the most basic rule.

From the figures above it is clear how Feynman diagrams help visualize particle
interactions, but how do they help calculate decays and production of particles? Each line of
a diagram is assigned a momentum and polarization vector, and each vertex is assigned 
a coupling factor. The diagram is then traced over, and all the mathematical terms, each 
corresponding to a line or vertex, are put together into an integral. The integral is used 



7 Notation in Feynman diagrams does still differ from textbook to textbook, but these are the general
conventions followed. Sometimes diagrams are drawn with time flowing from bottom to top, rather
than from left to right. An arrow pointing outwards on an incoming fermion line indicates that the
fermion is an anti-particle.

Figure 16.3: The basic Feynman diagram vertices. The letter $f$ stands for any fermion,
$\ell$ for a charged lepton, $\nu$ for a neutrino, and $q$ for quarks. The electroweak
vertices are given in Figures 16.3(a) through 16.3(c) while the strong force
vertices are given in Figures 16.3(d) through 16.3(f).

to calculate what is known as the matrix element of the process depicted in the
diagram. This matrix element can then be used to calculate decays and productions using
a relation called Fermi's golden rule. Neither the details of this process nor Fermi's 
golden rule will be given here, but suffice it to say that this is the deeper mathematical 
meaning of Feynman diagrams. 

Using the allowed vertices of Figure 16.3 and the decay process of Equation 16.9 we can
now draw the Feynman diagram for β-decay, given in Figure 16.2(b). To begin we have
the one up and two down quarks of a neutron. One of the down quarks then radiates a
$W^-$ boson and turns into an up quark. The $W^-$ boson then decays into an electron and
an anti-electron neutrino. We can use the method outlined above for calculating matrix
elements from Feynman diagrams to determine the decay probability for the neutron.
However, without a few months of a particle physics course the steps might be a bit
incomprehensible, and so the final result is presented here without derivation.^8



$$\mathrm{PDF}(E_e) \approx C_0 E_e \sqrt{E_e^2 - m_e^2 c^4} \left( c^2 (m_n - m_p) - E_e \right)^2 \qquad (16.10)$$

Equation 16.10 gives the probability density function for finding an electron with
energy $E_e$ from a neutron decay.^9 Here, the coefficient $C_0$ is some normalization factor
so that the integral of the function is one over the valid range of the equation. From
the square root we see that the minimum value $E_e$ can have is the rest energy of the electron,
$m_e c^2$. Additionally, we notice that the maximum energy the electron can have is when
no momentum is imparted to the neutrino, and so the decay becomes effectively the
two body decay of Equation 16.1 with a maximum energy of 1.30 MeV. Keeping this in
mind we can draw the distribution of the electron energy in Figure 16.3. Notice that the
expectation value of this distribution is 1 MeV, considerably lower than the 1.3 MeV
predicted earlier. When compared to experimental data, Equation 16.10 and Figure 16.3
agree very well.^10
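
For readers who would like to explore the shape of Equation 16.10 themselves, the sketch below normalizes the distribution numerically and computes its mean. It is only a rough illustration: it uses the rounded masses from Section 16.1 with $c = 1$, a simple midpoint rule for the integrals, and helper names (`shape`, `midpoint_integral`) invented for this example. The mean it prints depends on those rounded inputs and can be compared with the figure.

```python
import math

# Masses in MeV/c^2, rounded as in Section 16.1; c = 1 throughout.
M_NEUTRON, M_PROTON, M_ELECTRON = 939.6, 938.3, 0.5
E_MIN = M_ELECTRON              # electron produced at rest
E_MAX = M_NEUTRON - M_PROTON    # endpoint c^2 (m_n - m_p) of Equation 16.10

def shape(e):
    """Equation 16.10 without the normalization C_0; zero outside the valid range."""
    if e <= E_MIN or e >= E_MAX:
        return 0.0
    return e * math.sqrt(e**2 - M_ELECTRON**2) * (E_MAX - e) ** 2

def midpoint_integral(func, lo, hi, steps=20000):
    """Midpoint-rule approximation to the integral of func from lo to hi."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

norm = midpoint_integral(shape, E_MIN, E_MAX)   # equals 1 / C_0
mean_energy = midpoint_integral(lambda e: e * shape(e), E_MIN, E_MAX) / norm

print(f"C_0 = {1.0 / norm:.2f}")
print(f"mean electron energy = {mean_energy:.2f} MeV")
```

Note that the mean must lie between the electron rest energy and the endpoint, whatever the exact masses used.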



16.4 Beta Ray Absorption 

So now we have a theory that accurately predicts the energy spectrum of electrons
from β-decay, but what about a theoretical model that predicts the absorption of the
electrons? First, let us consider electrons traveling along the x-axis, where they hit some
absorber that is perpendicular to the x-axis and parallel to the y-axis. Let us assume the
absorber is infinitesimally thin, with thickness $dx$, so that the absorber is just a single
sheet of atoms.

Now if we consider N electrons entering the absorber, we can write that the number 



8 However, if you do want to see the full derivation, take a look at the chapter on neutron decays in
Introduction to Elementary Particles by David Griffiths.

9 For more details on probability density functions consult Chapter 1.

10 See Free-Neutron Beta-Decay Half-Life by C.J. Christensen, et al. published in Physical Review D,
Volume 5, Number 7, April 1, 1972.

of electrons absorbed is, 

$$dN = -\lambda N \, dx \qquad (16.11)$$

where $dN$ is just the change in number of electrons, $\lambda$ is a constant called the
absorption coefficient, $N$ is the number of electrons, and $dx$ is the distance traveled
by the electrons. Physically, this equation states that the electrons enter the absorber,
and then are reduced by a certain percentage $\lambda$, per unit distance. If the electrons then
travel over a distance $dx$ the total percentage by which $N$ is reduced is just $\lambda \, dx$.

Equation 16.11 is a separable differential equation, so we can separate the variables 
and integrate both sides. 



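$$\begin{aligned}
\frac{dN}{N} &= -\lambda \, dx \\
\int \frac{dN}{N} &= -\int \lambda \, dx \\
\ln N + C_0 &= -\lambda x + C_1 \\
N &= C_2 e^{-\lambda x}
\end{aligned} \qquad (16.12)$$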



In the first step, the variables have just been separated, while in the second and third
step the indefinite integrals are performed yielding the integration constants of $C_0$ and
$C_1$. In the final step both sides are exponentiated and all the constants are absorbed
into $C_2$. By letting $x = 0$ we see that $N(0) = C_2$, and so $C_2$ must be the initial number
of electrons before any absorption occurs, or $N_0$.



$$N = N_0 e^{-\lambda x} \qquad (16.13)$$
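
One way to convince yourself that the exponential really does follow from the thin-absorber argument is to stack many thin sheets numerically, applying the fractional loss $\lambda \, dx$ at each one, and compare with the analytic result. The sketch below does this; the initial count, absorption coefficient, and thickness are made-up values chosen only for illustration.

```python
import math

def attenuate_stepwise(n0, lam, thickness, steps):
    """Apply dN = -lam * N * dx across `steps` thin slices of width dx."""
    dx = thickness / steps
    n = n0
    for _ in range(steps):
        n += -lam * n * dx
    return n

# Made-up values: 1000 electrons, lam = 2.0 per cm, absorber 1.5 cm thick.
n0, lam, thickness = 1000.0, 2.0, 1.5
stepwise = attenuate_stepwise(n0, lam, thickness, steps=100000)
analytic = n0 * math.exp(-lam * thickness)

print(f"stepwise: {stepwise:.3f}, analytic: {analytic:.3f}")
```

The two numbers approach each other as the slices are made thinner, which is exactly the limit taken when the sum over slices is replaced by an integral.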

But what is the meaning behind $\lambda$ in Equation 16.13? As stated above, $\lambda$ essentially
tells us the percentage of electrons absorbed, per unit distance. From the previous section
on the Standard Model, we know that electrons can interact through the weak and
electromagnetic forces, but primarily through the electromagnetic force at low energies
like these. As electrons pass through the absorber, they are slowed down by their
electromagnetic interactions with the atoms. Of course, the atoms are made up of very
dense nuclei surrounded by large electron clouds,^11 so the electrons from the β-radiation
will mainly interact with the electron clouds of the atoms in a fashion similar to that of
Figure 16.2(a).

The denser the electron clouds of the absorber, the more likely an electron from
β-radiation will be absorbed, and so it is clear that the absorption coefficient should
somehow depend on the electron cloud density, or $\rho_e$. So what is the electron cloud
density? Let us first assume that the absorber is one single element and none of the
atoms are ions, so the number of electrons is equal to the number of protons, or the
atomic number $Z$ of the element. We now want to find the number of electrons per
unit volume. First we can convert the number of electrons per atom into the electrons



11 If this is unfamiliar territory, see Chapter 14.

per mole by multiplying $Z$ with Avogadro's number, $N_A$. Next, we can convert this
quantity into electrons per gram by dividing by the atomic weight $M$ of the element
which is in grams per mole. Finally, this can be converted to electrons per volume by
multiplying the previous product with the density of the absorber, $\rho_a$.



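$$\rho_e \left[ \frac{\text{electrons}}{\text{cm}^3} \right] = \left( \frac{\text{electrons}}{\text{atom}} \right) \left( \frac{\text{atoms}}{\text{mole}} \right) \left( \frac{\text{moles}}{\text{gram}} \right) \left( \frac{\text{grams}}{\text{cm}^3} \right) = \rho_a \left( \frac{Z N_A}{M} \right) \qquad (16.14)$$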


The atomic number for carbon is 6, while the atomic weight is 12.011, and so $Z/M \approx 1/2$.
Actually, all of the light elements up to chlorine and argon have a value for $Z/M$
of nearly 1/2. Consequently, the electron density of Equation 16.14 is only dependent
upon the density of the absorber for the lighter elements! This means that the absorption
coefficient is linearly proportional to the density of the absorption material! However, it
would be nice to have a slightly better understanding of what other parameters determine
$\lambda$.
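
The claim that $Z/M$ sits near 1/2 for the lighter elements can be checked with a few lines of code. The sketch below tabulates $Z/M$ for a handful of elements using standard rounded atomic weights, and then evaluates Equation 16.14 for aluminium. The choice of elements, the rounded Avogadro constant, and the density of 2.70 g/cm$^3$ for aluminium are assumptions made for this example rather than values taken from the text.

```python
AVOGADRO = 6.022e23  # atoms per mole (rounded)

# (atomic number Z, atomic weight M in grams per mole), standard rounded values.
ELEMENTS = {
    "carbon": (6, 12.011),
    "oxygen": (8, 15.999),
    "aluminium": (13, 26.982),
    "chlorine": (17, 35.45),
    "argon": (18, 39.948),
}

def electron_density(z, m, rho_absorber):
    """Equation 16.14: electrons per cm^3 for an absorber of density rho_absorber (g/cm^3)."""
    return rho_absorber * z * AVOGADRO / m

for name, (z, m) in ELEMENTS.items():
    print(f"{name:10s} Z/M = {z / m:.3f}")

# Aluminium at an assumed density of 2.70 g/cm^3.
z_al, m_al = ELEMENTS["aluminium"]
rho_e_aluminium = electron_density(z_al, m_al, 2.70)
print(f"aluminium: {rho_e_aluminium:.2e} electrons per cm^3")
```

The ratio drifts slowly below 1/2 as the elements get heavier, which is why the statement is restricted to the lighter elements.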

Theoretically, determining $\lambda$ is quite challenging as we must now consider how the
electrons are physically absorbed by the electron clouds. We can instead just take a
qualitative look at the process. We have already determined how the absorber affects
$\lambda$, but how do the incoming electrons from the β-radiation affect $\lambda$? If an electron flies
by an atom at a very high velocity (very large $E_e$), the electron is hardly affected, and
passes nearly straight by. However, if the electron has a very small velocity (very small
$E_e$), the electrons of the atom will cause a much more drastic change in the path of
the electron. From this we see that $\lambda$ should be inversely proportional to the electron
energy.

Experimentally, the value for $\lambda$ has been determined to be$^2$, 

$$
\lambda \approx \frac{17 \rho_a}{E_{max}^{1.14}} \qquad (16.15)
$$



where $E_{max}$ is the maximum electron energy, which for the case of $\beta$-radiation from 
free neutrons is just given by Equation 16.8. Most sources of $\beta$-radiation do not consist 
of free neutrons, and so as the electrons leave the nucleus, they must fight the attractive 
electromagnetic pull of the protons. This means that the maximum energy of most 
electrons from $\beta$-decay is well below that of Equation 16.8. The shape of the electron 
energy spectrum, however, is very similar to that of the free neutron in Figure 16.3. 
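
To get a feel for the numbers, the empirical relation can be evaluated directly. This is a minimal sketch that assumes the relation has the form $\lambda \approx 17 \rho_a / E_{max}^{1.14}$, with $E_{max}$ in MeV, $\rho_a$ in g/cm^3, and $\lambda$ in 1/cm; these units and the example density of 1.4 g/cm^3 are assumptions made for illustration, as are the function names.

```python
def absorption_coefficient(rho_a, e_max):
    """Absorption coefficient (assumed 1/cm) from density (g/cm^3) and E_max (MeV)."""
    return 17.0 * rho_a / e_max ** 1.14


def max_energy(rho_a, lam):
    """Invert the empirical relation to recover E_max (MeV) from lambda."""
    return (17.0 * rho_a / lam) ** (1.0 / 1.14)


# Hypothetical absorber density and a few illustrative maximum energies.
RHO_EXAMPLE = 1.4  # g/cm^3, assumed
for e_max in (0.5, 1.0, 2.0):
    lam = absorption_coefficient(RHO_EXAMPLE, e_max)
    print(f"E_max = {e_max:.1f} MeV -> lambda = {lam:.1f} per cm")
```

Doubling the maximum energy cuts the absorption coefficient by a little more than half, in line with the qualitative argument that faster electrons are deflected less.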



2 Taken from The Atomic Nucleus by Robley D. Evans, pages 627-629. 
16.5 Experiment 

There are two clear predictions about $\beta$-decay derived in the sections above. The first is 
that the emitted electrons should not be at a single energy, but rather spread over a 
spectrum of energies. The second prediction states that the electrons from $\beta$-decay 
should be absorbed over distances as described by Equation 16.13, where $\lambda$ is directly 
proportional to the density of the absorbing material, and is given experimentally by 
Equation 16.15. Testing the first theoretical prediction is possible, but outside the scope 
of this book. The second prediction is much easier to test, and this is done in three parts. 

In the first part of the experiment for this chapter a Geiger-Müller tube is calibrated to 
yield a good count reading from a small source of $\beta$-radiation. This is done by adjusting 
the distance of the tube from the source and changing the bias voltage across the tube. 
The source is removed and a background count is made. Next the source is reinserted, 
and a radiation count with no absorber is made. This gives the coefficient $N_0$ in Equation 
16.13. Next, thin pieces of cardboard and mylar are placed as absorbers between the 
source and the tube. The number of layers is recorded along with the count rate from the 
tube. The mass per unit area of the mylar and cardboard absorbers is found by weighing 
the absorber, and dividing by the area of the absorber calculated by simple geometry. 
Plots are then made of the number of counts versus the mass per unit area of the 
absorbers. Because $\lambda$ is dependent only on $E_{max}$ and $\rho_a$, the plots should provide 
the same result assuming theory is correct. 

The second part of the experiment then uses the plots just created to determine the 
mass per unit area of an irregular cardboard shape (i.e. the mass per unit area cannot 
be calculated easily using geometry) after having determined the count rate of electrons. 
The final part of the experiment recasts the data from the first part of the experiment 
in terms of count rate and distance absorbed, as described by Equation 16.13. A plot of 
this relationship is made and a value for $\lambda$ is determined for the mylar absorber. Using 
Equation 16.15, the maximum electron energy of the $\beta$-radiation source can then be 
determined. 
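
The analysis described above can be sketched in a few lines. The example assumes that Equation 16.13 is the usual exponential attenuation law, so that the logarithm of the background-subtracted counts falls on a straight line when plotted against the mass per unit area, and that Equation 16.15 can be written per unit density as 17 / E_max^1.14 (with E_max in MeV, an assumed unit). All of the data below are synthetic and were generated inside the script; none of it is a real measurement, and the function names are invented for illustration.

```python
import math


def fit_line(xs, ys):
    """Unweighted least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_attenuation(mass_per_area, counts, background):
    """Fit ln(counts - background) against mass per unit area.

    Returns the mass absorption coefficient (lambda / rho_a, cm^2/g)
    and the fitted count with no absorber.
    """
    logs = [math.log(c - background) for c in counts]
    slope, intercept = fit_line(mass_per_area, logs)
    return -slope, math.exp(intercept)


def max_energy_from_fit(mass_coefficient):
    """Invert the assumed empirical relation lambda / rho_a = 17 / E_max^1.14."""
    return (17.0 / mass_coefficient) ** (1.0 / 1.14)


# Synthetic, noise-free data (g/cm^2 and counts), generated from known inputs.
TRUE_COEFFICIENT = 20.0  # cm^2/g
BACKGROUND = 30.0
N_ZERO = 5000.0
mass_per_area = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
counts = [N_ZERO * math.exp(-TRUE_COEFFICIENT * x) + BACKGROUND for x in mass_per_area]

coefficient, n_zero = fit_attenuation(mass_per_area, counts, BACKGROUND)
print(f"lambda / rho_a = {coefficient:.2f} cm^2/g, N_0 = {n_zero:.0f}")
print(f"E_max = {max_energy_from_fit(coefficient):.3f} MeV")
```

With real counts the points scatter about the line, so the uncertainty on each count should be propagated into the fit rather than ignored as it is here.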
While superconductors are rarely encountered in day-to-day life they are well known by 
the public, and not just within the physics community. Superconductors oftentimes play 
important roles in science fiction and capture readers' imaginations with their almost 
mysterious capabilities, but they are also very real and are used in current technologies 
such as Magnetic Resonance Imaging (MRI) and the Large Hadron Collider (LHC). The 
discovery of superconductors was made in 1911 by Kamerlingh Onnes, yet the theory 
behind superconductors remains incomplete to this day. This chapter provides a brief 
overview of the physical properties of superconductors along with the theory behind them 
and a more in-depth look at the Meissner effect. However, the information provided here 
just scratches the surface of superconductivity; the experimental and theoretical research 
done in this field is extensive. 



17.1 Superconductors 

So what exactly is a superconductor? As the name implies, superconductors are very 
good at conducting; as a matter of fact, they are perfect conductors. This means 
that if a superconductor is made into a loop, and electricity is run around the loop, the 
current will continue to flow forever without being pushed by a battery or generator. Of 
course, forever is a rather strong word, and experimental physicists don't have quite that 
much patience, but it has been shown in experiments running over periods of years that 
the current flowing within a superconducting loop has not degraded enough to be 
registered by the precision of the instruments used!$^1$ 

However, superconductors are more than just perfect conductors. They also exhibit 
a property called the Meissner effect, which states that a magnetic field cannot exist 
within the interior of the superconductor. This is an important difference between 
superconductors and perfect conductors and will be discussed more in the Meissner 
Effect section of this chapter. Just like a square is a rectangle but a rectangle is not 
necessarily a square, a superconductor is a perfect conductor, but a perfect conductor is 
not necessarily a superconductor. 

There are two fundamental properties that describe superconductors, a critical 
temperature $T_c$, and a critical magnetic field (magnitude) $B_c$. If the temperature of a 
superconductor exceeds the critical temperature of the superconductor, it will no longer 
superconduct, and will transition to a normal state. Similarly, if the superconductor is 
subjected to a magnetic field higher than the critical magnetic field, the superconductor 
will transition to a normal state. 



1 The resistance of superconductors has been shown to be less than $10^{-26}$ $\Omega$. 
Figure 17.1: All currently known Type I superconductors are outlined in bold. This 
figure was modified from the Wikimedia file Periodic Table Armtuk3.svg. 



Depending on the type of superconductor, the values for $T_c$ and $B_c$ can vary greatly. 
Originally, there were two known types of superconductors, creatively named Type I and 
Type II superconductors. There are thirty Type I superconductors, all consisting of 
pure metals such as gallium, which has a critical temperature of $T_c \approx 1.1$ K. In general, 
Type I superconductors have very low critical temperatures, almost all below 10 K. 
Additionally, Type I superconductors have low critical magnetic fields. For example, 
gallium has a $B_c$ of $\approx 51$ Gauss. All currently known Type I superconductors are 
outlined in bold in the periodic table of Figure 17.1.$^2$ 

Type II superconductors consist of alloys, such as niobium and tin, and have higher 
critical temperatures, in this example $T_c \approx 17.9$ K.$^3$ The highest critical temperature 
of Type II superconductors is 23 K.$^4$ More recently, high temperature superconductors, 
neither Type I nor Type II, have been discovered, which, as the name implies, have 
much higher temperatures at which they can superconduct.$^5$ 



2 Ashcroft and Mermin. Solid State Physics. Brooks/Cole. 1976. 
3 Rohlf, James William. Modern Physics from $\alpha$ to $Z^0$. Wiley. 1994. 
4 Kittel, Charles. Introduction to Solid State Physics. Wiley. 
5 "More recently" being a rather relative term. Specifically, in 1988 ceramic mixed oxide superconductors 
Figure 17.2: Primitive cell of the YBa$_2$Cu$_3$O$_7$ superconductor used in this experiment. 
The superconducting currents flow through the planes outlined in red. 



These high temperature superconductors are ceramic crystalline substances, some with 
critical temperatures as high as $T_c \approx 125$ K. The ceramic YBa$_2$Cu$_3$O$_7$, used in the 
experiment associated with this chapter, has a critical temperature of $T_c \approx 90$ K, higher 
than the temperature of liquid nitrogen. The primitive cell crystal structure of 
YBa$_2$Cu$_3$O$_7$ is shown in Figure 17.2. The superconducting currents flow through the 
planes outlined in red. 



17.2 BCS Theory 

But how exactly do superconductors work? As mentioned previously, the theory behind 
superconductors is still developing, although there is a theory that describes 
superconductivity for Type I superconductors on the level of the atom. This theory, or BCS 
theory, was developed by John Bardeen, Leon Cooper, and Robert Schrieffer during 
the 1950s, for which they earned the Nobel prize in 1972. While the mathematics behind 
the theory is very involved, the physical idea behind the theory is quite elegant. 

To fully understand the theory, a few things about conductivity need to be discussed. 
The conductivity of a metal arises from the nuclei of the atoms making up the metal 
arranging themselves into an ion lattice. The outer electrons of the atoms are not 
bound tightly to the nuclei and are able to freely move about in what is known as the 
electron gas. When the electrons in the electron gas scatter off the nuclei, the electrons 
lose energy, and this is why resistance occurs in metals. 

When a metal is cooled to a very low temperature, the ion lattice of the metal becomes 
more and more like a crystal. As the electrons in the electron gas move across the lattice, 
the negative charge of the electron pulls on the positive charges of the nuclei in the lattice. 
This pulls the nuclei towards the electron as the electron moves by as shown in Figure 
17.3(a). As the electron continues to move, a vibrational wave is formed in the lattice as 
more nuclei move towards the electron and the other nuclei settle back into place. This 



were discovered and are now used in commercial applications. 
Figure 17.3: A diagram of the forming of Cooper pairs, necessary for the BCS theory. In 
Figure 17.3(a) an electron with spin $+\frac{1}{2}$ moves to the right while an electron 
with spin $-\frac{1}{2}$ moves to the left. Each electron is coupled to a phonon. In 
Figure 17.3(b), the electron on the left has coupled with the phonon of the 
electron on the right, and formed a Cooper pair with the second electron. 



wave within the lattice is called a phonon.$^6$ 

As a second electron moves across the lattice it becomes attracted to the phonon 
trailing the first electron. The second electron is pulled into the phonon and the two 
electrons, one with spin $m_s = +\frac{1}{2}$ and the other with spin $m_s = -\frac{1}{2}$, couple 
or join through the shared phonon. This pair of electrons is called a Cooper pair and can 
only form when the thermal energy of the metal (set by its temperature) is less than the 
binding energy of the Cooper pair. The spins of particles add together, and so the total 
spin of the Cooper pair is zero. This means that the Cooper pair is a boson. 
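
Comparing a temperature with a binding energy only makes sense once the temperature is converted into an energy, which is done with Boltzmann's constant. The sketch below is purely an order-of-magnitude illustration: it evaluates $k_B T$ at the critical temperatures quoted earlier in the chapter and says nothing about the detailed BCS relation between the two quantities. The function name is invented for this example.

```python
BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant in eV per kelvin


def thermal_energy_ev(temperature_k):
    """Characteristic thermal energy k_B * T in electron volts."""
    return BOLTZMANN_EV_PER_K * temperature_k


# Critical temperatures quoted in this chapter (kelvin).
for label, t_c in (("gallium", 1.1), ("niobium-tin", 17.9), ("YBa2Cu3O7", 90.0)):
    print(f"{label:12s} T_c = {t_c:5.1f} K -> k_B T_c = {thermal_energy_ev(t_c):.2e} eV")
```

These energies are tiny compared to the electron volt scale of atomic binding energies, which gives a sense of just how fragile a Cooper pair is.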

Once a sufficient number of Cooper pairs are formed from the electron gas, a 
Bose-Einstein condensate forms from the electrons. This means that all the Cooper pairs 
have the lowest possible energy, and so they are no longer likely to scatter off the ion 
lattice. As soon as this occurs, the metal becomes a superconductor, as the Cooper pairs 
can now move freely through the ion lattice without scattering. 

This theory explains both what the critical temperature of a Type I superconductor is, 
and its critical magnetic field. The critical temperature is reached when the Cooper pairs 
are no longer broken apart by the kinetic energy of the nuclei, or when the thermal energy 
is less than the binding energy of the pair. The critical magnetic field is likewise reached 
when the magnetic field no longer breaks apart the Cooper pairs and allows a 
Bose-Einstein condensate to form. 

6 The reason the name phonon is used is that this is a quantized sound wave. The prefix phon 
indicates sound, as in phonetics, and the prefix phot indicates light, as in photos. A quantized 
light wave is called a photon, and so a quantized sound wave is called a phonon. 

17.3 Meissner Effect 

While the explanation for the BCS theory given above is fully qualitative, it is also 
possible to derive the theory in a more mathematical fashion. While this will not be 
done here, as it would not be helpful to the discussion, there is an important result that 
needs to be discussed called the London equations which are given in Equation 17.1. 

$$
\frac{\partial \vec{I}}{\partial t} = \frac{n_s e^2}{m_e} \vec{E} \qquad (17.1a)
$$

$$
\nabla \times \vec{I} = -\frac{n_s e^2}{m_e c} \vec{B} \qquad (17.1b)
$$

Here, $\vec{I}$ is the current flowing through the superconductor (notice that it is a vector 
quantity and has a direction!), $\vec{E}$ is the electric field in the superconductor, and 
$\vec{B}$ is the magnetic field in the superconductor. The letter $e$ is the charge of the 
electron, $m_e$ the mass of the electron, $n_s$ a physical constant dependent upon the 
material of the superconductor, and $c$ the speed of light. 

For those not familiar with the symbol $\nabla$, this is called a nabla, and when the notation 
$\nabla \times$ is used in front of a vector, this means take the curl of the vector. To find the 
direction of the curl of a vector, wrap your right hand around in the direction the vector 
is pointing; your thumb then points in the direction of the curl. This is called the 
right hand rule. As an example, when current is flowing through a wire, there is an 
associated magnetic field. The curl of this magnetic field points in the direction of the 
current.$^7$ 
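
For those who prefer numbers to hand gestures, the curl can also be estimated numerically. The following is a minimal sketch using central finite differences; the field `rotating_field` is an invented example that circulates counterclockwise about the z-axis, so the right hand rule says its curl should point along +z.

```python
def curl(field, point, h=1e-5):
    """Estimate the curl of field(x, y, z) -> (vx, vy, vz) at a point."""

    def partial(component, axis):
        # Central difference of one field component along one coordinate axis.
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        return (field(*forward)[component] - field(*backward)[component]) / (2 * h)

    return (
        partial(2, 1) - partial(1, 2),  # d(vz)/dy - d(vy)/dz
        partial(0, 2) - partial(2, 0),  # d(vx)/dz - d(vz)/dx
        partial(1, 0) - partial(0, 1),  # d(vy)/dx - d(vx)/dy
    )


def rotating_field(x, y, z):
    """A field circulating counterclockwise in the x-y plane."""
    return (-y, x, 0.0)


print(curl(rotating_field, (0.3, -0.2, 1.0)))
```

The result is very nearly (0, 0, 2) wherever the curl is evaluated: the field circulates in the x-y plane and the curl points out of that plane, just as the thumb does.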

One of Maxwell's equations, specifically Ampere's law, states that the curl of the 
magnetic field is equal to the current times the magnetic constant, $\mu_0$. 

$$
\nabla \times \vec{B} = \mu_0 \vec{I} \qquad (17.2)
$$

Substituting in $\vec{I}$ from this equation into Equation 17.1b yields the following 
differential equation. 

$$
\nabla^2 \vec{B} = \frac{\mu_0 n_s e^2}{m_e c} \vec{B} \qquad (17.3)
$$

While many differential equations do not have known solutions, this one luckily does, as 
it is just an ordinary differential equation. The solution is, 

$$
B(x) = B_0 e^{-x/\lambda} \qquad (17.4)
$$



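7 The curl of a vector is also a vector and is explicitly calculated by taking the determinant of the matrix 
$$
\nabla \times \vec{v} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ v_i & v_j & v_k \end{vmatrix}
$$
where $v_i$ are the components of the vector $\vec{v}$. 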
where, 

$$
\lambda = \sqrt{\frac{m_e c}{\mu_0 n_s e^2}} \qquad (17.5)
$$

is called the London penetration depth and $B_0$ is the magnitude of the magnetic 
field at the surface of the superconductor. The variable $x$ is the distance from the surface 
of the superconductor. 
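
For readers who would like to see the intermediate steps, the chain of reasoning can be sketched as follows. The sketch assumes the second London equation in the form $\nabla \times \vec{I} = -\frac{n_s e^2}{m_e c} \vec{B}$, uses the standard vector identity for the curl of a curl, and takes the field to vary only with the depth $x$ into the superconductor. The growing solution $e^{+x/\lambda}$ is discarded because the field cannot increase without bound inside the material.

```latex
\begin{align*}
\nabla \times (\nabla \times \vec{B}) &= \mu_0 \, \nabla \times \vec{I}
  && \text{(curl of Ampere's law)} \\
\nabla (\nabla \cdot \vec{B}) - \nabla^2 \vec{B} &= -\frac{\mu_0 n_s e^2}{m_e c} \vec{B}
  && \text{(vector identity, second London equation)} \\
\nabla^2 \vec{B} &= \frac{\mu_0 n_s e^2}{m_e c} \vec{B} = \frac{1}{\lambda^2} \vec{B}
  && \text{(since } \nabla \cdot \vec{B} = 0 \text{)} \\
\frac{d^2 B}{dx^2} &= \frac{B}{\lambda^2}
  \quad \Rightarrow \quad B(x) = B_0 e^{-x/\lambda}
  && \text{(field varying only with depth } x \text{)}
\end{align*}
```

Differentiating $B_0 e^{-x/\lambda}$ twice with respect to $x$ returns the same function divided by $\lambda^2$, which confirms that it solves the last equation.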

The math above can be a little daunting, but it is not necessary to understand the 
specifics behind it. The important result to look at is Equation 17.4. Interpreting 
this equation physically, we see that the magnetic field inside a superconductor decays 
exponentially and that past the London penetration depth, the magnetic field within a 
superconductor is nearly zero! 
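
It is worth seeing how quickly the exponential actually falls. The sketch below measures depth in units of the London penetration depth, so no material constants are needed; the function names are invented for this example.

```python
import math


def field_fraction(depth_in_lambda):
    """Fraction B(x) / B_0 remaining at a depth given in units of lambda."""
    return math.exp(-depth_in_lambda)


def depth_for_fraction(fraction):
    """Depth, in units of lambda, at which the field has dropped to this fraction."""
    return -math.log(fraction)


for depth in (0, 1, 2, 3, 5, 10):
    print(f"x = {depth:2d} lambda -> B / B_0 = {field_fraction(depth):.5f}")

print(f"field is down to 1% at x = {depth_for_fraction(0.01):.2f} lambda")
```

At exactly one penetration depth a little over a third of the surface field still remains; it takes several penetration depths before the field is negligible.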

This is just the Meissner effect, mentioned at the very beginning of the chapter. From 
BCS theory (and a little bit of hand waving around the math) we have managed to 
theoretically explain why the Meissner effect occurs, on a microscopic scale. On a historical 
note, the London equations of Equation 17.1 were developed phenomenologically 
before BCS theory. This means that physicists developed the London equations to model 
the Meissner effect, which they did well, but just did not know the physical reason why 
the equations were correct. 

Now that we have a mathematical understanding of the Meissner effect, it is time to 
understand the physical consequences. When a superconductor is placed in a magnetic 
field, small eddy currents begin to circulate at the surface of the superconductor. 
These currents create magnetic fields that directly oppose the external magnetic field, 
canceling it, and keeping the magnetic field at the center of the superconductor near 
zero. Because the superconductor is a perfect conductor, the eddy currents continue 
traveling without resistance and can indefinitely oppose the external magnetic field. 

This is why unassisted levitation is possible using superconductors. A ferromagnet 
is placed over the superconductor, and through the Meissner effect the superconductor 
creates a magnetic field that directly cancels the magnetic field of the ferromagnet. This 
in turn causes a magnetic repulsion that holds the ferromagnet in place against the force 
of gravity. 

It is important to remember that both the London equations and BCS theory are 
only valid for Type I superconductors. Type II superconductors allow magnetic fields 
to pass through filaments within the material. Supercurrents surround the filaments 
in a vortex state to produce the mixed-state Meissner effect, where the external 
magnetic field is not completely excluded. 

Figure 17.4 compares what happens if a normally conducting object is subjected to an 
external magnetic field and then transitions to a perfect conductor, Type I 
superconductor, or Type II superconductor. In the first scenario, Figure 17.4(a), the magnetic 
field remains within the perfect conductor. No matter what external magnetic field is now 
applied to the perfect conductor, the internal magnetic field will remain exactly the same. 
If, for example, the external magnetic field were shut off, eddy currents would continue 
to produce the exact same magnetic field within the center of the perfect conductor. 
This phenomenon is known as perfect diamagnetism. 
Figure 17.4: Comparison of the magnetic field lines passing through a perfect conductor, 
Type I superconductor, and Type II superconductor above and below the 
critical temperature. The red lines indicate eddy currents which oppose the 
external magnetic field. 

In the second scenario, the external magnetic field is completely excluded from the 
interior of the Type I superconductor. Any change within the external magnetic field 
will trigger eddy currents that directly oppose the external magnetic field. In the final 
scenario, most of the magnetic field is excluded, yet some magnetic field is still able to 
pass through the filaments of the Type II superconductor. 

17.4 Experiment 

The experiment for this chapter uses the Meissner effect to measure the critical 
temperature of a high temperature YBa$_2$Cu$_3$O$_7$ superconductor. The apparatus used 
consists of a superconductor, around which is wrapped a solenoid.$^8$ The experiment is 
broken into two steps. 

In the first step, the solenoid is cooled to a temperature of $\sim 70$ K using liquid nitrogen. 
The resistance of the solenoid is measured using an ohmmeter for different temperatures 
of the solenoid (determined by a thermocouple near the solenoid). Because resistance is 
caused by electrons scattering off energetic nuclei, we expect the resistance of the coil 
to decrease as the temperature decreases. This portion of the experiment has nothing 
to do with superconductors. 

For the second step we need just a little more theory. When an external magnetic 
field is applied to a conductor, it tries to keep the magnetic field within itself the same, 
just like with the perfect conductor. The conductor does this by creating eddy currents 
which create magnetic fields that oppose the external magnetic field. Unlike the case 
of the perfect conductor, these eddy currents decay over time due to resistance and 
eventually the conductor succumbs to the external magnetic field. This whole process is 
called inductance and is measured by a unit called the henry. The longer a conductor 
fights the external magnetic field, the larger the inductance. 

If we drive an alternating current through the solenoid, we can measure the inductance 
using the equation, 

$$
L = \frac{1}{\omega} \sqrt{\left(\frac{V}{I}\right)^2 - R^2} \qquad (17.6)
$$

where $\omega$ is the angular frequency of the alternating current, $L$ the inductance of the 
solenoid, $V$ the root mean square (RMS) voltage in the solenoid, $I$ the RMS current in the 
solenoid, and $R$ the resistance of the solenoid at that temperature. For those 
uncomfortable with this equation just being handed down from on high, try to derive it. The 
process is not the simplest, but with a little effort and thinking it can be done. Again, 
neglecting the derivation, the inductance for a solenoid is, 

$$
L = \frac{\mu N^2 A}{\ell} \qquad (17.7)
$$



where $\mu$ is the magnetic permeability of the core of the solenoid, $N$ the number of turns 
of wire in the solenoid, $A$ the cross-sectional area of the solenoid, and $\ell$ the length of 
the solenoid. 

8 A solenoid is just a circular coil of wire. The beauty of solenoids is that they produce relatively uniform 
magnetic fields in their centers. 

Looking at Equation 17.7, we see that for a large magnetic permeability, the inductance 
of the solenoid is very large, while for a small permeability, the inductance is small. The 
core of the solenoid in our experimental setup is just the superconductor, which when 
at room temperature, has a relatively normal value for $\mu$, and so the inductance of 
the solenoid will be relatively large. However, below the critical temperature of the 
superconductor, the Meissner effect takes hold, and the magnetic field can no longer 
pass through the core of the solenoid. Essentially, $\mu$ has become zero. This means that 
the inductance of the solenoid will drop to nearly zero. 
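
The two expressions for the inductance are simple enough to evaluate directly. The sketch below assumes the measured-inductance relation L = sqrt((V/I)^2 - R^2) / omega, with omega equal to 2 pi times the drive frequency in hertz, and the solenoid relation L = mu N^2 A / l in SI units. The coil dimensions and drive settings are invented round numbers, not values from the actual apparatus.

```python
import math

MU_0 = 4.0e-7 * math.pi  # magnetic constant in henries per metre


def inductance_from_measurement(v_rms, i_rms, resistance, frequency_hz):
    """Inductance (H) from RMS voltage (V), RMS current (A), and resistance (ohm)."""
    omega = 2.0 * math.pi * frequency_hz
    reactance_squared = (v_rms / i_rms) ** 2 - resistance ** 2
    if reactance_squared < 0:
        raise ValueError("V/I is smaller than R; check the measurements")
    return math.sqrt(reactance_squared) / omega


def solenoid_inductance(mu, turns, area, length):
    """Inductance (H) of a solenoid with core permeability mu (H/m)."""
    return mu * turns ** 2 * area / length


# Hypothetical coil: 1000 turns, 1 cm^2 cross-section, 10 cm long, empty core.
print(f"geometry: L = {solenoid_inductance(MU_0, 1000, 1.0e-4, 0.1) * 1e3:.3f} mH")

# Hypothetical measurement: 1 kHz drive, 10 mA RMS, 5 ohm coil, 80 mV RMS.
print(f"measured: L = {inductance_from_measurement(0.080, 0.010, 5.0, 1000.0) * 1e3:.3f} mH")
```

The guard against a negative value under the square root is worth keeping when working with real data, since noisy readings near the superconducting state can push V/I slightly below R.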

Using Equation 17.6, it is possible to calculate the inductance for the solenoid using the 
resistance of the coil, determined in step one, along with measuring the RMS current and 
RMS voltage passing through the coil for various temperatures. From the explanation 
above, we expect to see a dramatic jump in inductance at some point in the graph where 
the core of the solenoid transitions from a superconducting state to a standard state. By 
determining this jump, we have found the critical temperature of the superconductor! 
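
One simple way to locate the jump is to look for the largest change in inductance between neighbouring temperature readings. The sketch below does exactly that; the data are synthetic, with a step deliberately placed between 90 K and 92 K, and the function name is invented for illustration. It is not the only way to find the transition, and with noisy data a smoother estimate may be needed.

```python
def critical_temperature(temperatures, inductances):
    """Midpoint of the neighbouring readings with the largest inductance change."""
    pairs = sorted(zip(temperatures, inductances))
    best_jump = 0.0
    best_midpoint = None
    for (t_low, l_low), (t_high, l_high) in zip(pairs, pairs[1:]):
        jump = abs(l_high - l_low)
        if jump > best_jump:
            best_jump = jump
            best_midpoint = 0.5 * (t_low + t_high)
    return best_midpoint


# Synthetic readings: temperature in kelvin, inductance in millihenries.
temperatures = list(range(80, 102, 2))
inductances = [
    (0.10 if t < 91 else 1.00) + 0.001 * (t - 80)  # small drift plus one step
    for t in temperatures
]

print(f"estimated critical temperature: {critical_temperature(temperatures, inductances)} K")
```

The spacing of the temperature readings sets the precision of the answer, so it pays to take more closely spaced readings near the transition.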